1. Introduction and Aims

We have quality-controlled the 10X data and the SS2 data and now are left with the following objects:

10X 5K data - pb_sex_filtered

10X 30K data - pb_30k_sex_filtered

SS2 mutant data - ss2_mutants_final

2. Read in the data

Load/Install the Required Packages

[1] "patchwork is loaded correctly"
[1] "viridis is loaded correctly"
[1] "Seurat is loaded correctly"
Loading required package: cowplot

Attaching package: ‘cowplot’

The following object is masked from ‘package:patchwork’:

    align_plots
[1] "cowplot is loaded correctly"
Loading required package: gridExtra
[1] "gridExtra is loaded correctly"
Loading required package: grid
[1] "grid is loaded correctly"
Loading required package: Hmisc
Loading required package: lattice
Loading required package: survival
Loading required package: Formula
Loading required package: ggplot2

Attaching package: ‘Hmisc’

The following object is masked from ‘package:Seurat’:

    Key

The following objects are masked from ‘package:base’:

    format.pval, units
[1] "Hmisc is loaded correctly"
Loading required package: reshape2
[1] "reshape2 is loaded correctly"
Loading required package: dplyr

Attaching package: ‘dplyr’

The following objects are masked from ‘package:Hmisc’:

    src, summarize

The following object is masked from ‘package:gridExtra’:

    combine

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union
[1] "dplyr is loaded correctly"
Loading required package: Nebulosa
[1] "Nebulosa is loaded correctly"
Loading required package: monocle3
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply,
    parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:dplyr’:

    combine, intersect, setdiff, union

The following object is masked from ‘package:gridExtra’:

    combine

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval, evalq,
    Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax,
    pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which, which.max, which.min

Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.


Attaching package: ‘Biobase’

The following object is masked from ‘package:Hmisc’:

    contents

Loading required package: SingleCellExperiment
Loading required package: SummarizedExperiment
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:dplyr’:

    first, rename

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: IRanges

Attaching package: ‘IRanges’

The following objects are masked from ‘package:dplyr’:

    collapse, desc, slice

Loading required package: GenomeInfoDb
Loading required package: DelayedArray
Loading required package: matrixStats

Attaching package: ‘matrixStats’

The following objects are masked from ‘package:Biobase’:

    anyMissing, rowMedians

The following object is masked from ‘package:dplyr’:

    count


Attaching package: ‘DelayedArray’

The following objects are masked from ‘package:matrixStats’:

    colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

The following objects are masked from ‘package:base’:

    aperm, apply, rowsum


Attaching package: ‘SummarizedExperiment’

The following object is masked from ‘package:Seurat’:

    Assays


Attaching package: ‘monocle3’

The following objects are masked from ‘package:Biobase’:

    exprs, fData, fData<-, pData, pData<-
[1] "monocle3 is loaded correctly"

Read in the Data

screen hits

## EDIT - change this to the excel table once we have it finalized for the screen
screen_hits <- c("PBANKA-0516300",
"PBANKA-1217700",
"PBANKA-0409100",
"PBANKA-1034300",
"PBANKA-1437500",
"PBANKA-0827500",
"PBANKA-0824300",
"PBANKA-1426900",
"PBANKA-0105300",
"PBANKA-0921100",
"PBANKA-1002400",
"PBANKA-0829400",
"PBANKA-1347200",
"PBANKA-0828000",
"PBANKA-0902300",
"PBANKA-1418100",
"PBANKA-1435200",
"PBANKA-1454800",
"PBANKA-0712300",
"PBANKA-0410500",
"PBANKA-1144800",
"PBANKA-1231600",
"PBANKA-0503200",
"PBANKA-0308900",
"PBANKA-1214700",
"PBANKA-0709900",
"PBANKA-0311900",
"PBANKA-0716500",
"PBANKA-1447900",
"PBANKA-0102200",
"PBANKA-0713500",
"PBANKA-0102400",
"PBANKA-1302700",
"PBANKA-1235900",
"PBANKA-0401100",
"PBANKA-0413400",
"PBANKA-1126900",
"PBANKA-1425900",
"PBANKA-0418300",
"PBANKA-1464600",
"PBANKA-0806000")

Read in gene annotations

gene_annotations <- read.table("../data/Reference/GenesByTaxon_Summary.csv", header = TRUE, sep = ",", stringsAsFactors = TRUE)
dim(gene_annotations)

## convert _ to -
gene_annotations$Gene.ID <- gsub("_", "-", gene_annotations$Gene.ID)

load in datasets

paste("10x dataset")
[1] "10x dataset"
pb_sex_filtered
An object of class Seurat 
5098 features across 6191 samples within 1 assay 
Active assay: RNA (5098 features, 2000 variable features)
 2 dimensional reductions calculated: pca, umap
paste("Smart-seq2 dataset")
[1] "Smart-seq2 dataset"
ss2_mutants_final
An object of class Seurat 
5245 features across 2717 samples within 1 assay 
Active assay: RNA (5245 features, 2000 variable features)
 2 dimensional reductions calculated: pca, umap
paste("The composition of the Smart-seq2 dataset is:")
[1] "The composition of the Smart-seq2 dataset is:"
table(ss2_mutants_final@meta.data$genotype)

Mutant     WT 
  2028    689 

3. Merging the Smart-seq2 and 10X Data

Prepare data

tenx_5k_counts_to_integrate
An object of class Seurat 
5098 features across 6191 samples within 1 assay 
Active assay: RNA (5098 features, 0 variable features)

We need to make sure the mutant data is compatible with the 10X data. the 10X data has fewer genes represented so we need to find the intersect of the two before integration.

GCSKO_mutants
An object of class Seurat 
5018 features across 2717 samples within 1 assay 
Active assay: RNA (5018 features, 0 variable features)
## double check that this is the same number of genes
## subset counts so that only genes represented in the other two objects are there:
length(intersect(rownames(tenx_5k_counts), rownames(mutant_counts_for_integration)))
[1] 5018

IMPORTANT - this next step is different to GCSKO_merge as it subsets the smart-seq2 data into wild-type only.

## subset wt on ss2 data:
ss2_wt_cells <- rownames(GCSKO_mutants@meta.data[GCSKO_mutants@meta.data$genotype == "WT", ])
GCSKO_mutants_wtonly <- subset(GCSKO_mutants, cells = ss2_wt_cells)

create list and normalise:

## make list
tenx.justwt.list <- list(tenx_5k_counts_to_integrate, GCSKO_mutants_wtonly)

## prepare data
for (i in 1:length(tenx.justwt.list)) {
    tenx.justwt.list[[i]] <- NormalizeData(tenx.justwt.list[[i]], verbose = FALSE)
    tenx.justwt.list[[i]] <- FindVariableFeatures(tenx.justwt.list[[i]], selection.method = "vst", 
        nfeatures = 2000, verbose = FALSE)
    all.genes <- rownames(tenx.justwt.list[[i]])
    tenx.justwt.list[[i]] <- ScaleData(tenx.justwt.list[[i]], features = all.genes)
}

Integrate objects

## Find anchors
tenx.justwt.anchors <- FindIntegrationAnchors(object.list = tenx.justwt.list, dims = 1:21, verbose = FALSE)

## Integrate data
tenx.justwt.integrated <- IntegrateData(anchorset = tenx.justwt.anchors, dims = 1:21, verbose = FALSE, features.to.integrate = genes_in_tenx_dataset)

4. Dimensionality reduction

PCA

Optimised UMAP

After optimisation, the following UMAP can be calculated:

## Run optimised UMAP
tenx.justwt.integrated <- RunUMAP(tenx.justwt.integrated, reduction = "pca", dims = 1:10, n.neighbors = 50, seed.use = 1234, min.dist = 0.4, repulsion.strength = 0.03, local.connectivity = 150)

Make final plots:


## split seurat object up
ob.list <- SplitObject(tenx.justwt.integrated, split.by = "experiment")

## make plots for each object
plot.list <- lapply(X = ob.list, FUN = function(x) {
    DimPlot(x, dims = c(1,2), pt.size = 1) + theme(legend.position="bottom")
})

composition_umap_10x <- plot.list$`tenx_5k` + 
  coord_fixed() +
  theme_void() +
  scale_color_manual(values=c(replicate(45, "#999999"))) +
  labs(title = paste("10x (wild-type)")) +
  theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))

composition_umap_ss2 <- plot.list$mutants +
  coord_fixed() +
  theme_void() +
  scale_color_manual(values=c(replicate(46, "#999999"))) +
  labs(title = paste("Smart-seq2 (wild-type)")) +
  theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))

composition_umap <- composition_umap_10x + composition_umap_ss2 

composition_umap

save

ggsave("../images_to_export/composition_umap.png", plot = composition_umap, device = "png", path = NULL, scale = 1, width = 20, height = 20, units = "cm", dpi = 300, limitsize = TRUE)
## make plots
## hoo dataset correlation
UMAP_hoo <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "Prediction.Spearman.", dims = c(1,2)) +
  coord_fixed() + 
  theme_void() +
  labs(title = paste("Hoo Predicted Timepoint")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
  scale_colour_manual(values = inferno(12))  +
  labs(colour = "hour") +
  theme(legend.position = "bottom", legend.title=element_text(size=10))

## ap2g timecourse in this paper correlation
UMAP_kasia <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "Prediction.Spearman._Kasia", dims = c(1,2)) +
  coord_fixed() + 
  theme_void() +
  labs(title = paste("AP2G Timecourse Predicted Timepoint")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
  scale_colour_manual(values = inferno(10))  +
  labs(colour = "hour") +
  theme(legend.position = "bottom", legend.title=element_text(size=10))

## combine
umap_bulk <- wrap_plots(UMAP_hoo, UMAP_kasia, ncol = 2)

## print
umap_bulk

5. Clustering

Generate clusters

## generate new clusters at low resolution
tenx.justwt.integrated <- FindNeighbors(tenx.justwt.integrated, dims = 1:19, reduction = "pca")
tenx.justwt.integrated <- FindClusters(tenx.justwt.integrated, resolution = 2, random.seed = 42, algorithm = 2)

visualise clusters

DimPlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE, repel = FALSE, label = TRUE)

Make individual plots highlighting where cells in each cluster fall

[1] 24

plot

## this function writes the next bit of code for you
## put it into the console and paste the response
#ploty <- c()
#for(i in seq_along(levels(tenx.justwt.integrated@meta.data$seurat_clusters))){
#  ploty <- paste0(ploty, "list_UMAPs_by_cluster[[", i, "]]", " + ")
#}

## plot
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]] + list_UMAPs_by_cluster[[23]] + list_UMAPs_by_cluster[[24]]

Find Markers

visualise markers

## get the top 10 for each cluster
top10 <- pb.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_logFC)

## scale data is missing - but original dfs have scale.data
# df_scaled <- as.matrix(transform(merge(as.data.frame(tenx.justwt.list[[1]]@assays$RNA@scale.data), as.data.frame(tenx.justwt.list[[2]]@assays$RNA@scale.data), by = 0, all=FALSE), row.names=Row.names, Row.names=NULL))
# tenx.justwt.integrated@assays$RNA@scale.data <- df_scaled

## scale data in RNA slot
all.genes <- rownames(tenx.justwt.integrated)
tenx.justwt.integrated <- ScaleData(tenx.justwt.integrated, vars.to.regress = 'experiment', features = all.genes)

## heatmap
DoHeatmap(tenx.justwt.integrated, features = top10$gene) + NoLegend()

6. Define Cluster Identities

Downsampling

# ## check number of cells in each cluster
# df_clusters <- as.data.frame(table(tenx.justwt.integrated@meta.data$integrated_snn_res.2))
# plot(df_clusters)
# 
# ## downsample using appropriate metrics
# tenx.justwt.integrated.downsampled <- subset(tenx.justwt.integrated, downsample = 100)
#   
# ## inspect plot
# FeaturePlot(tenx.justwt.integrated.downsampled, features = "PBANKA-1437500", coord.fixed = TRUE, dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
#   theme_void() + 
#   labs(title = paste("AP2G (Commitment)")) + 
#   theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
#   scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

Marker Genes

useful tools for all plots

## define male and female symbol
female_symbol <- intToUtf8(9792)
male_symbol <- intToUtf8(9794)

## define colour pal
marker_gene_ramp <- colorRampPalette(c("#D3D3D3", "#1D1564"))(50)

Expression - cutoffs - purple


marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1319500", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(ccp2)~(Female))) + 
                          theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=6.0), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, 
                                    features = "PBANKA-0416100", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') +
                        coord_fixed() + 
                        theme_void() + 
                        labs(title = expression(italic(mg1)~(Male))) + 
                        theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                        scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1437500", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(ap2g)~(Commitment))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0831000", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(msp1)~(Schizont))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1102200", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(msp8)~(Asexual))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Could not find PBANKA-1102200 in the default search locations, found in RNA assay insteadCoordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1101300", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(SBP1)~(Ring))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0722600", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(fam-b2)~(Ring))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Could not find PBANKA-0722600 in the default search locations, found in RNA assay insteadCoordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_ORC1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0602000", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(orc1)~(DNA~replication))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Could not find PBANKA-0602000 in the default search locations, found in RNA assay insteadCoordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_CDPK5 <- FeaturePlot(tenx.justwt.integrated, 
                                      features = "PBANKA-1351500", 
                                      coord.fixed = TRUE, 
                                      min.cutoff = "q05", 
                                      max.cutoff = "q95", 
                                      dims = c(1,2), 
                                      reduction = "umap", 
                                      pt.size = 0.5, 
                                      order = TRUE, 
                                      slot = 'scale.data') +
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(cdpk5)~(Line~Schizont))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_MCM4 <- FeaturePlot(tenx.justwt.integrated, 
                                      features = "PBANKA-1415600", 
                                      coord.fixed = TRUE, 
                                      min.cutoff = "q05", 
                                      max.cutoff = "q95", 
                                      dims = c(1,2), 
                                      reduction = "umap", 
                                      pt.size = 0.5, 
                                      order = TRUE, 
                                      slot = 'scale.data') +
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(mcm4)~(DNA~replication))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
# + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)

marker_gene_plot_all

Mutant Genes

Expression - with cutoffs

# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD4 
# PBANKA-0716500        GCSKO-19  MD5 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD3
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_fd3 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-1418100", 
                                   coord.fixed = TRUE,
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                        theme_void() + 
                        labs(title = expression(italic(fd3))) + 
                        theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Could not find PBANKA-1418100 in the default search locations, found in RNA assay insteadScale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                        ## add sex symbols
                        #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                        #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md4 <- FeaturePlot(tenx.justwt.integrated, 
                                  dims = c(1,2), 
                                  reduction = "umap", 
                                  features = "PBANKA-0102400", 
                                  coord.fixed = TRUE, 
                                  min.cutoff = "q05",
                                  max.cutoff = "q95", 
                                  pt.size = 0.5, 
                                  order = TRUE, 
                                  slot = 'scale.data') + 
                      theme_void() + 
                      labs(title = expression(italic(md4))) + 
                      theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                      ## add sex symbols
                      #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                      #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md5 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-0716500", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95",
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md5))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = "")) 
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd4 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-1435200", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd4))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = "")) 
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd2 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-0902300", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd2))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                       ## add sex symbols
                       # + annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md3 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-0413400", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95",
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md3))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Could not find PBANKA-0413400 in the default search locations, found in RNA assay insteadScale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                       ## add sex symbols
                      #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_gd1 <- FeaturePlot(tenx.justwt.integrated, 
                                  dims = c(1,2), 
                                  reduction = "umap",
                                  features = "PBANKA-0828000", 
                                  coord.fixed = TRUE, 
                                  min.cutoff = "q05", 
                                  max.cutoff = "q95",
                                  pt.size = 0.5, 
                                  order = TRUE, 
                                  slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(gd1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Could not find PBANKA-0828000 in the default search locations, found in RNA assay insteadScale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md1 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-1302700",
                                    coord.fixed = TRUE,
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md2 <- FeaturePlot(tenx.justwt.integrated,
                                   dims = c(1,2), 
                                   reduction = "umap",
                                   features = "PBANKA-1447900",
                                   coord.fixed = TRUE,
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md2))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd1 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-1454800", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05",
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))
# scale_color_continuous_sequential(palette = "Purples 2")
# scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_gd1 , marker_gene_plot_md1 , marker_gene_plot_md2 , marker_gene_plot_md3 , marker_gene_plot_md4 , marker_gene_plot_md5 , marker_gene_plot_fd1 , marker_gene_plot_fd2 , marker_gene_plot_fd3 , marker_gene_plot_fd4, ncol = 4)
           
## print
mutant_expression_composite

save

ggsave("../images_to_export/ALL_CELLS_wt_gene_expression.png", plot = mutant_expression_composite, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD5
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_17 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1418100", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_2 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0102400", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_19 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0716500", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_20 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1435200", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_13 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0902300", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_10 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0413400", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md5")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_3 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0828000", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("gd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_oom <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1302700", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_29 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1447900", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_21 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1454800", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_17 , marker_gene_plot_2 , marker_gene_plot_19 , marker_gene_plot_20 , marker_gene_plot_13 , marker_gene_plot_10 , marker_gene_plot_3 , marker_gene_plot_oom , marker_gene_plot_29 , marker_gene_plot_21 , ncol = 4)
           
## print
mutant_expression_composite

7. Pseudotime on all cells

Pseudotime calculation

save

ggsave("../images_to_export/UMAP_pt_wt.png", plot = umap_pt, device = "png", path = NULL, scale = 1, width = 20, height = 20, units = "cm", dpi = 300, limitsize = TRUE)

gganimnate GIF of pseuodtime

#install.packages("gganimate")
library(gganimate)
#install.packages("gifski")
#install.packages("av")
#library(gifski)
#library(av)

## make dataframe for plotting
## extract data for GGplot version of this
df_animation <- as.data.frame(monocle.object.all@int_colData@listData$reducedDims@listData[["UMAP"]])
## add pt to this data frame:
pt_values <- as.data.frame(pseudotime(monocle.object.all, reduction_method = "UMAP"))
df_animation <- merge(df_animation, pt_values, by="row.names") 
rownames(df_animation) <- df_animation$Row.names
colnames(df_animation)[4] <- "pt"

## make the static plot
p <- ggplot(df_animation, aes(x = UMAP_1, y = UMAP_2, colour = pt)) +
  geom_point() +
  scale_colour_viridis_c(option = "plasma") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none")
## view plot
plot(p)

## make animated plot
## make a category for animation
#df_animation$group <- cut(df_animation$pt, 15)

anim <- p +
  transition_time(pt) +
  shadow_mark()

animate(anim, height = 3, width = 3, units = "in", res = 150, bg = 'transparent')

## to change the resolution - https://stackoverflow.com/questions/49058567/define-size-for-gif-created-by-gganimate-change-dimension-resolution 

Save animation

anim_save("animated_UMAP_transparent_bg_wt.gif", path = "../images_to_export/")
## extract pt values
pt_values <- as.data.frame(pseudotime(monocle.object.all, reduction_method = "UMAP"))

tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, pt_values, "old_pt_values")

Define cell identities with branches

Define identities of cells

male

monocle.object_male <- choose_graph_segments(monocle.object.all)

female

monocle.object_female <- choose_graph_segments(monocle.object.all)

bipotential

monocle.object_bipot <- choose_graph_segments(monocle.object.all)

asexual (pre-branch)

monocle.object_asex_pre <- choose_graph_segments(monocle.object.all)

asexual fate

monocle.object_asex_fate <- choose_graph_segments(monocle.object.all)

check

df_freq <- data.frame(table(c(colnames(monocle.object_male), colnames(monocle.object_female), colnames(monocle.object_bipot), colnames(monocle.object_asex_pre), colnames(monocle.object_asex_fate))))
paste("number of cells in seurat object is", length(colnames(monocle.object.all)), ". The number of cells selected here with an identitity is", dim(df_freq)[1])
df_freq <- df_freq[df_freq$Freq > 1, ]
df_freq

Inspect where these missing cells are:

# '%ni%' <- Negate('%in%')
# 
# not_assigned_cells <- colnames(monocle.object.all)[colnames(monocle.object.all) %ni% c(colnames(monocle.object_male), colnames(monocle.object_female), colnames(monocle.object_bipot), colnames(monocle.object_asex), colnames(monocle.object_asex_fate))]
# 
# DimPlot(seurat.object, repel = TRUE, label.size = 5, pt.size = 0.5, cells.highlight = not_assigned_cells, dims = c(2,1), reduction = "DIM_UMAP") +
#   coord_fixed() + 
#   scale_color_manual(values=c("#000000", "#f54e1e"))
## create annotation dataframe from these results:
df_monocle_sexes <- rbind(data.frame("cell_name" = colnames(monocle.object_male), "sex" = rep("Male", length(colnames(monocle.object_male)))),
                          data.frame("cell_name" = colnames(monocle.object_female), "sex" = rep("Female", length(colnames(monocle.object_female)))),
                          data.frame("cell_name" = colnames(monocle.object_bipot), "sex" = rep("Bipotential", length(colnames(monocle.object_bipot)))),
                          data.frame("cell_name" = colnames(monocle.object_asex_pre), "sex" = rep("Asexual_Early", length(colnames(monocle.object_asex_pre)))),
                          data.frame("cell_name" = colnames(monocle.object_asex_fate), "sex" = rep("Asexual_Late", length(colnames(monocle.object_asex_fate))))
                          #data.frame("cell_name" = not_assigned_cells, "sex" = rep("Unassigned", length(not_assigned_cells)))
                          )

dim(df_monocle_sexes)

## order like the metadata
df_monocle_sexes <- df_monocle_sexes[match(rownames(monocle.object.all@colData), df_monocle_sexes$cell_name), ]

## add this back into the monocle object
monocle.object.all@colData$Sexes_monocle <- df_monocle_sexes$sex

## add this to the seurat object
rownames(df_monocle_sexes) <- df_monocle_sexes$cell_name
df_monocle_sexes_to_add_to_seurat <- df_monocle_sexes[,c("sex"), drop = FALSE]
tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, df_monocle_sexes_to_add_to_seurat, col.name = "monocle_sex")

pt vs. real time

gg <- ggplot(df_plotting, aes(as.factor(Prediction.Spearman.), fill = monocle_sex)) + geom_bar(stat = "count", position="fill")

gg

8. Plots

make composite pseudotime/ID figure

save

#ggsave("../images_to_export/umap_id_pt.png", plot = umap_id_pt, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)

cluster plot

## these get defined later on, but are replicated above here for plotting
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## make a new column for plotting
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- tenx.justwt.integrated@meta.data$seurat_clusters

## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asexual_early_clusters,asexual_late_clusters, bipotential_clusters, male_clusters, female_clusters)
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- factor(x = tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, levels = my_levels)

## change the levels
levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting) <- list(A_1="9", 
                                      A_2="4", 
                                      A_3="15", 
                                      A_4 = "8", 
                                      A_5 = "1",
                                      A_6=  "14", 
                                      A_7 = "2", 
                                      A_8 = "10",
                                      A_9 = "3",
                                      A_10 = "0",
                                      A_11 = "6",
                                      A_12 = "5",
                                      A_13 = "7",
                                      A_14 = "12",
                                      A_15 = "18",
                                      A_16 = "20",
                                      A_17 = "23",
                                      P = "11",
                                      M_1 = "16",
                                      M_2 = "13",
                                      F_1 = "21",
                                      F_2 = "22",
                                      F_3 = "19",
                                      F_3 = "17"
                                      )

### Annotation set-up

## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex", "seurat_clusters_dot_plotting", "Prediction.Spearman.", "Prediction.Spearman._Kasia")]

## make a new column
df_pt_id$colour <- NA

## assign bins to each of the values
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values 
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))

## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))

## assign values to each cluster
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "colour"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
## BECAUSE we have ordered the clusters already, we can simply take the row index for this bit
df_annotation$colour <- NA
df_annotation[1:17, ]$colour <- asex_ramp(100)[df_annotation[1:17, ]$x]
df_annotation[18, ]$colour <- bipot_ramp(100)[df_annotation[18, ]$x]
df_annotation[19:20, ]$colour <- male_ramp(100)[df_annotation[19:20, ]$x]
df_annotation[21:23, ]$colour <- female_ramp(100)[df_annotation[21:23, ]$x]

## make colour pallete
#pal_plot <- c(asex_ramp(170)[1,11,21,31,], bipot_ramp(3)[2], male_ramp(5)[3:4], female_ramp(5)[2:5])
pal_plot <- c(df_annotation$colour, '#49C16DFF')

## plot
umap_with_clusters <- DimPlot(tenx.justwt.integrated,
        group.by = "seurat_clusters_dot_plotting",
        dims = c(1,2), 
        reduction = "umap", 
        pt.size = 1,
        label.box = TRUE,
        label.size = 8,
        label.color = c(rep(c("#000000"), 12), rep(c("#ffffff"), 5), rep(c("#000000"), 2),"#ffffff", "#000000", "#ffffff", "#ffffff"),
        order = TRUE, 
        repel = TRUE, 
        label = TRUE,
        cols = pal_plot) +
  scale_color_manual(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8",  "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3"), values = pal_plot) +
  theme_void() +
                     coord_fixed()
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
umap_with_clusters

save

ggsave("../images_to_export/umap_with_clusters.png", plot = umap_with_clusters, device = "png", path = NULL, scale = 1, width = 21, height = 29.5, units = "cm", dpi = 300, limitsize = TRUE)

dotplot

A dotplot allows us to look at the expression of multiple genes in a clearer way than succesive UMAP plots

### Data set-up

# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0413400    GCSKO-10_820  MD3
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0102400         GCSKO-2  MD5 

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

# PBANKA-1437500 - AP2G - commitment
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)

marker_genes_list <- c("PBANKA-1437500", "PBANKA-1319500", "PBANKA-0416100", "PBANKA-0831000", "PBANKA-1102200")
mutant_genes_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0413400", "PBANKA-0716500", "PBANKA-0102400",  "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

## these get defined later on, but are replicated above here for plotting
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## copy the clusters so you don't permanently edit the master
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- tenx.justwt.integrated@meta.data$seurat_clusters

## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asexual_early_clusters,asexual_late_clusters, bipotential_clusters, male_clusters, female_clusters)

## reorder the levels
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- factor(x = tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, levels = my_levels)

## rename clusters so that they are intuitive
# this trick is from here: http://www.cookbook-r.com/Manipulating_data/Renaming_levels_of_a_factor/
levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting) <- list(Asexual_1="9", 
                                                                              Asexual_2="4", 
                                                                              Asexual_3="15", 
                                                                              Asexual_4 = "8", 
                                                                              Asexual_5 = "1",
                                                                              Asexual_6=  "14", 
                                                                              Asexual_7 = "2", 
                                                                              Asexual_8 = "10",
                                                                              Asexual_9 = "3",
                                                                              Asexual_10 = "0",
                                                                              Asexual_11 = "6",
                                                                              Asexual_12 = "5",
                                                                              Asexual_13 = "7",
                                                                              Asexual_14 = "12",
                                                                              Asexual_15 = "18",
                                                                              Asexual_16 = "20",
                                                                              Asexual_17 = "23",
                                                                              Bipotential = "11",
                                                                              Male_1 = "16",
                                                                              Male_2 = "13",
                                                                              Female_1 = "21",
                                                                              Female_2 = "22",
                                                                              Female_3 = "19",
                                                                              Female_3 = "17"
                                                                              )

### Annotation set-up

## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex", "seurat_clusters_dot_plotting", "Prediction.Spearman.", "Prediction.Spearman._Kasia")]

## make a new column
df_pt_id$colour <- NA

## assign bins to each of the values
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values 
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))

## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))

## assign values to each cluster
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "colour"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
## BECAUSE we have ordered the clusters already, we can simply take the row index for this bit
df_annotation$colour <- NA
df_annotation[1:17, ]$colour <- asex_ramp(100)[df_annotation[1:17, ]$x]
df_annotation[18, ]$colour <- bipot_ramp(100)[df_annotation[18, ]$x]
df_annotation[19:20, ]$colour <- male_ramp(100)[df_annotation[19:20, ]$x]
df_annotation[21:23, ]$colour <- female_ramp(100)[df_annotation[21:23, ]$x]

## plot annotation 
## this really helped: https://www.biostars.org/p/396810/
h2 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     #legend <- plot_grid(get_legend(h2), get_legend(h1), ncol = 1)
 h2 <- h2 + theme(legend.position = "none")
 #   geom_point(col = df_umap_plot$colour)
 
 h1 <- ggplot(df_annotation)+
  geom_point(mapping = aes(x = Group.1, y = 1),
            col = df_annotation$colour,
            size = 5)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     #legend <- plot_grid(get_legend(h2), get_legend(h1), ncol = 1)
 h1 <- h1 + theme(legend.position = "none")
 #   geom_point(col = df_umap_plot$colour)

 ## add predicted time point
 ## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "Prediction.Spearman."], list(df_pt_id$seurat_clusters_dot_plotting), mean)
df_annotation$colour <- NA
df_annotation$colour <- viridis(300)[(df_annotation$x)*10]

h3 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     legend_h3 <- get_legend(h3)
 h3 <- h3 + theme(legend.position = "none")

  h3 <- ggplot(data = df_annotation, 
              mapping = aes(x = Group.1, fill=x)
              ) +
       geom_bar(col = "#FFFFFF", width = 1)+
       theme_void() +
       theme(panel.spacing.x = unit(1, "mm")) +
       scale_fill_viridis(option = 'cividis')
 legend_h3 <- get_legend(h3)
 h3 <- h3 + theme(legend.position = "none")
 
 ## add kasia data
  ## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "Prediction.Spearman._Kasia"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
df_annotation$colour <- NA
df_annotation$colour <- viridis(300)[(df_annotation$x)*10]

h4 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))+
     facet_grid(.~colour, scales = "free_x")
     legend <- plot_grid(get_legend(h4), ncol = 1)
    legend_h4 <- get_legend(h4)
 h4 <- h4 + theme(legend.position = "none")

 h4 <- ggplot(data = df_annotation, 
              mapping = aes(x = Group.1, fill=x)
              ) +
       geom_bar(col = "#FFFFFF", width = 1)+
       theme_void() +
       theme(panel.spacing.x = unit(1, "mm")) +
       scale_fill_viridis()
 legend_h4 <- get_legend(h4)
 h4 <- h4 + theme(legend.position = "none")

### Plot
dot_plot_markers <- DotPlot(tenx.justwt.integrated, 
                            features = c(marker_genes_list, rev(mutant_genes_list)), 
                            group.by = "seurat_clusters_dot_plotting", 
                            dot.min = 0.00001,
                            assay = 'RNA') +
  theme_classic() +
  coord_fixed() +
  coord_flip() +
  # change appearance and remove axis elements, and make room for arrows
  theme(axis.text.x = element_text(size=16, angle = 45, hjust=1,vjust=1, family = "Arial"), 
        axis.text.y = element_text(size=16, face="italic"), 
        text=element_text(size=16, family="Arial", colour="black"), 
        legend.position = "bottom", 
        legend.direction = "horizontal", 
        legend.box = "vertical", 
        plot.title = element_blank(), 
        plot.margin = unit(c(1,3,1,3), "lines"), 
        axis.line = element_blank(),
        panel.border = element_rect(colour = "black", fill=NA, size=0.5)) +
  #change the colours
  #scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white", begin = 0, end = 1, direction = 1) +
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## change x axis label
  # labs(x = "Marker Genes", y = "Cluster", title = "Expression of Marker Genes by Cluster") 
  labs(x = "", y = "", title = "") +
  ## add arrows
  #annotate("segment", x = 5.5, xend = 5.5, y = 21.5, yend = 25, colour = "green", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  #annotate("segment", x = 5.5, xend = 5.5, y = 16.5, yend = 21.5, colour = "red", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  #annotate("segment", x = 5.5, xend = 5.5, y = 0, yend = 15.5, colour = "grey", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  ## annotate asex
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## annotate bipotential
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters, bipotential_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## annotate sexes
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters, bipotential_clusters, male_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## change label on bottom of plot so we can indicate markers
  scale_x_discrete(labels = rev(c("gd1", "md1", "md2", "md3", "md4", "md5", "fd1", "fd2", "fd3", "fd4", "msp8", "msp1", "mg1", "ccp2", "ap2g"))) +
  ## change label on bottom of plot so we can indicate markers
  scale_y_discrete(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8",  "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3")) +
  ## change name of legends
  guides(col=guide_colorbar(title = 'Scaled Average Expression'),
         size=guide_legend("% of cells expressing"))
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
## view
#print(dot_plot_markers)

plot <- plot_grid(h4, h3, h1, dot_plot_markers, align = "v", ncol = 1, axis = "tb", rel_heights = c(0.5, 0.5, 0.5, 18)) 
Removed 37 rows containing missing values (geom_point).font family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font database
dot_plot <- plot_grid(plot, legend_h3, legend_h4, nrow = 1, rel_widths = c(10, 1.5, 1.5))

dot_plot

save

ggsave("../images_to_export/dot_plot_all.png", plot = dot_plot, device = "png", path = NULL, scale = 1, width = 29.7, height = 21, units = "cm", dpi = 300, limitsize = TRUE)

9. Subset sexual cells

Make a subsetted Seurat object of sexual cells.

Include the pre-branch too as well as any weird clusters that may have clustered out.

it’s been a while since we looked at the clusters so let’s check them out again:

## Plot
DimPlot(tenx.justwt.integrated, label = TRUE, repel = FALSE, pt.size = 0.05, group.by = "seurat_clusters", dims = c(1,2), reduction = "umap") + coord_fixed()

## plot
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]] + list_UMAPs_by_cluster[[23]] + list_UMAPs_by_cluster[[24]]

Define cells and subset

asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## define cells
cell_names_subset_monocle_ids <- rownames(tenx.justwt.integrated@meta.data[tenx.justwt.integrated@meta.data$monocle_sex %in% c("Asexual_Early", "Bipotential", "Male", "Female"), ])

## 3, 0, 6, 5 are early clusters of asexuals before the branch
cell_names_subset_cluster_ids <- rownames(tenx.justwt.integrated@meta.data[tenx.justwt.integrated@meta.data$seurat_clusters %in% c(male_clusters, female_clusters, bipotential_clusters, 3, 0, 6, 5), ])

cell_names_subset_intersect <- intersect(cell_names_subset_monocle_ids, cell_names_subset_cluster_ids)

## subset cells into new object
tenx.justwt.integrated.sex <- subset(tenx.justwt.integrated, cells = cell_names_subset_intersect)

inspect/check

## inspect object
tenx.justwt.integrated.sex

## look at original UMAP
DimPlot(tenx.justwt.integrated.sex, label = TRUE, repel = TRUE, pt.size = 0.1, split.by = "experiment", dims = c(1,2), reduction = "umap") + coord_fixed()

Remove contaminant asexual cells

we want to remove:

## look at original UMAP
plot_sexual_subsetting <- DimPlot(tenx.justwt.integrated.sex, label = TRUE, repel = TRUE, pt.size = 0.1, dims = c(1,2), reduction = "umap") + 
  coord_fixed() + 
  geom_vline(aes(xintercept = -1, alpha = 5))

plot_sexual_subsetting
## extract cell embeddings
df_sex_cell_embeddings <- as.data.frame(tenx.justwt.integrated.sex@reductions[["umap"]]@cell.embeddings)

## subset anything lower than -0.8 in UMAP 2 and -0.1 in UMAP 1
remove_cells <- row.names(df_sex_cell_embeddings[which(df_sex_cell_embeddings$UMAP_1 < -1), ])

## plot these cells
DimPlot(tenx.justwt.integrated.sex, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = remove_cells, dims = c(1,2), reduction = "umap") + 
  coord_fixed() + 
  scale_color_manual(values=c("#000000", "#f54e1e")) + 
  theme_void() + 
  labs(title = paste("cells highlighted will be removed")) + 
  theme(plot.title = element_text(hjust = 0.5), legend.position = "none")

Final Subset

tenx.justwt.integrated.sex
An object of class Seurat 
10116 features across 2817 samples within 2 assays 
Active assay: RNA (5098 features, 0 variable features)
 1 other assay present: integrated
 2 dimensional reductions calculated: pca, umap

copy old clusters over

## copy old clusters
tenx.justwt.integrated.sex <- AddMetaData(tenx.justwt.integrated.sex, tenx.justwt.integrated.sex@meta.data$seurat_clusters, col.name = "post_integration_clusters")

10. Sex assignment of mutants

A. Seurat Method

## find transfer anchors
DefaultAssay(tenx.justwt.integrated) <- "integrated"
merge.anchors <- FindTransferAnchors(reference = tenx.justwt.integrated, query = GCSKO_mutants, 
    dims = 1:30)
Performing PCA on the provided reference using 2000 features as input.
Projecting PCA
Finding neighborhoods
Finding anchors
    Found 920 anchors
Filtering anchors
    Retained 610 anchors
## transfer data between ref and query
predictions <- TransferData(anchorset = merge.anchors, refdata = tenx.justwt.integrated$seurat_clusters_dot_plotting, 
    dims = 1:30)
Finding integration vectors
Finding integration vector weights
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Predicting cell labels
## add meta data to object
GCSKO_mutants <- AddMetaData(GCSKO_mutants, metadata = predictions)

## new object from this
mutant_seurat <- GCSKO_mutants

## look at breakdown of mutant by designation
df <- as.data.frame(mutant_seurat@meta.data)
table(df$predicted.id, df$identity_name_updated)
            
             fd1 fd2 fd3 fd4 gd1 md1 md2 md3 md4 md5 wild-type
  Asexual_1    2  43   5   7  10   4   4   6   5  42        37
  Asexual_10   6  29   5   4   9   3   9  10   1  27        42
  Asexual_12   0   1   0   0   1   1   1   0   0   0         0
  Asexual_13   2   4  10   1  35  10   4  32   5   0        15
  Asexual_14   1   3  17   3  40  25   8  15   8   4        40
  Asexual_15  19   6  19   2  24  10   4  25   6   2        25
  Asexual_16  27  13  30   4  33  49  13  75   8  14        63
  Asexual_17  10  23  15   4  13  25  13  38  12  19        48
  Female_1     2  51   5   0   6  23  64   2   1   4        37
  Female_3    19  11  14  23   0 105 141  86  14  21       160
  Male_1       0   1   0   0  48   0   0   2   9   1         2
  Male_2      65  55  14  45  14   0   0  45   8 112       220
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- df[df$genetic_background =="PBANKA_820", ]
table(df_820$predicted.id, df_820$fluoresence_sorted_on)
               
                GFP Hoechst mCherry
  Asexual_Early   0      69       1
  Asexual_Late   33     459      16
  Female          2     342     240
  Male          182      49       0

inspect gd1

df[which(df$predicted.id == "Female_1" & df$identity_name_updated == "gd1"), ][, 118:146]

Calculate sex ratios

## use designations above for sexes
male_cells <- rownames(mutant_seurat@meta.data[mutant_seurat@meta.data$predicted.id == "Male", ])
female_cells <- rownames(mutant_seurat@meta.data[mutant_seurat@meta.data$predicted.id == "Female", ])
ss2_mutants_final_male <- subset(mutant_seurat, cells = male_cells)
ss2_mutants_final_female <- subset(mutant_seurat, cells = female_cells)

## inspect
ss2_mutants_final_male
An object of class Seurat 
5018 features across 636 samples within 1 assay 
Active assay: RNA (5018 features, 0 variable features)
ss2_mutants_final_female
An object of class Seurat 
5018 features across 788 samples within 1 assay 
Active assay: RNA (5018 features, 0 variable features)
## calculate sex ratios
##subset out H, sorted cells:
df_male <- ss2_mutants_final_male@meta.data[ss2_mutants_final_male@meta.data$exclude_for_sex_ratio == FALSE,]

dim(df_male)
[1] 454 124
df_female <- ss2_mutants_final_female@meta.data[ss2_mutants_final_female@meta.data$exclude_for_sex_ratio == FALSE,]

dim(df_female)
[1] 546 124
## make dataframe
df_sex_ratio <- merge(
  as.data.frame(table(df_male$sub_name_updated)), 
  as.data.frame(table(df_female$sub_name_updated)), 
  by = "Var1", all=TRUE)

# or use identity_updated

## add names
names(df_sex_ratio) <- c("genotype", "male", "female")

## change the NAs to 0
df_sex_ratio[is.na(df_sex_ratio)] <- 0

## collapse 820 wild-types together
combined_m <- df_sex_ratio[df_sex_ratio$genotype == "WT-820_3_5", ]$male + df_sex_ratio[df_sex_ratio$genotype == "WT-820", ]$male
combined_f <- df_sex_ratio[df_sex_ratio$genotype == "WT-820_3_5", ]$female + df_sex_ratio[df_sex_ratio$genotype == "WT-820", ]$female
df_sex_ratio <- rbind(df_sex_ratio, c("WT-820-combined", combined_m, combined_f))
invalid factor level, NA generated
# remove old rows
df_sex_ratio <- df_sex_ratio[-which(df_sex_ratio$genotype == "WT-820_3_5" | df_sex_ratio$genotype == "WT-820"), ]
# need to make numeric again
df_sex_ratio$male <- as.numeric(df_sex_ratio$male)
df_sex_ratio$female <- as.numeric(df_sex_ratio$female)
df_sex_ratio$genotype <- as.character(df_sex_ratio$genotype)
# add name for WT combined
df_sex_ratio$genotype[17] <- "WT-820-combined"

## calculate sex ratio
df_sex_ratio$sex_ratio <- (df_sex_ratio$male + 0.1)/(df_sex_ratio$female + 0.1)

## log sex ratio
df_sex_ratio$sex_ratio_log <- log10(df_sex_ratio$sex_ratio)

##view
df_sex_ratio
## remove WT-2 because it is really inappropriate to have a sex ratio for this:
df_sex_ratio <- df_sex_ratio[-which(df_sex_ratio$genotype == "WT-md5"),]

plot

B. SCMAP Method

Build the index

### Making an ortholog reference index

## load in mca data
#counts <- read.csv("../scmap/allpb10x_counts.csv", row.names = 1)
#pheno <- read.csv("../scmap/allpb10x_pheno.csv")
#ggplot(pheno, aes(x=PC2_3d, y = PC3_3d,colour=absclust3)) + geom_point()

## load required libraries
library(scmap) #https://bioconductor.org/packages/release/bioc/html/scmap.html 
library(SingleCellExperiment) #

#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))
#colData(sce) <- cbind(colData(sce), pcs)
#rowData(sce)$feature_symbol <- rowData(sce)$gene

## extract data from Seurat
#cells_tenx <- rownames(tenx.justwt.integrated@meta.data[which(tenx.justwt.integrated@meta.data$experiment == "tenx_5k"), ])
#tenx.justwt.integrated.10k <- subset(tenx.justwt.integrated, cells = cells_tenx)
counts = as.matrix(GetAssayData(tenx.justwt.integrated, slot = "counts", assay = "RNA"))
pheno = as.data.frame(tenx.justwt.integrated@meta.data)

## add UMAP coordinates
df_sex_cell_embeddings <- as.data.frame(tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings)
pheno <- cbind(pheno, df_sex_cell_embeddings)

## Set up object
sce <- SingleCellExperiment(list(counts=counts),
    colData=DataFrame(label=pheno),
    rowData=DataFrame(feature_symbol=rownames(counts)))
sce
class: SingleCellExperiment 
dim: 5098 6880 
metadata(0):
assays(1): counts
rownames(5098): PBANKA-0000101 PBANKA-0000301 ... PBANKA-MIT0360 PBANKA-MIT0370
rowData names(1): feature_symbol
colnames(6880): AAACCTGGTAAGGGCT AAACCTGGTGCACTTA ... SC25027_8_95 SC25027_8_96
colData names(127): label.orig.ident label.nCount_RNA ... label.UMAP_1 label.UMAP_2
reducedDimNames(0):
altExpNames(0):
## manual logging
#counts_1 <- assay(sce, "counts")
#libsizes <- colSums(counts_1)
#size.factors <- libsizes/mean(libsizes)
#logcounts(sce) <- log2(t(t(counts_1)/size.factors) + 1)
#counts(sce) <- as.matrix(counts(sce))
#logcounts(sce) <- as.matrix(logcounts(sce))
logcounts(sce) <- as.matrix(GetAssayData(tenx.justwt.integrated, slot = "data", assay = "RNA"))

## remove features with duplicated names
sce <- sce[!duplicated(rownames(sce)), ]

## build scmap-cell reference index, save this rds
sce <- selectFeatures(sce, suppress_plot = FALSE, n_features = 2000)

table(rowData(sce)$scmap_features)

FALSE  TRUE 
 3098  2000 
set.seed(1)
sce <- indexCell(sce, M = 50, k = 80)
names(metadata(sce)$scmap_cell_index)
[1] "subcentroids" "subclusters" 
length(metadata(sce)$scmap_cell_index$subcentroids)
[1] 50
dim(metadata(sce)$scmap_cell_index$subcentroids[[1]])
[1] 40 80
metadata(sce)$scmap_cell_index$subcentroids[[1]][,1:5]
                        1           2           3           4          5
PBANKA-0100200 0.14630862 0.526027391 0.043704597 0.084187544 0.27749519
PBANKA-0100700 0.09460507 0.073257106 0.313286733 0.370676472 0.27852452
PBANKA-0100900 0.09923532 0.087566303 0.059203198 0.094759828 0.06865377
PBANKA-0101000 0.04763362 0.033790484 0.035038280 0.029494053 0.09082912
PBANKA-0101200 0.05167727 0.100910226 0.041157644 0.087829422 0.04947390
PBANKA-0101300 0.03427392 0.000000000 0.006482623 0.017440407 0.04685456
PBANKA-0101400 0.12483604 0.023321169 0.056473567 0.214402432 0.30191506
PBANKA-0101500 0.03194160 0.050889592 0.037542520 0.040676819 0.02895481
PBANKA-0101600 0.03798147 0.007504403 0.028087301 0.055265292 0.02882148
PBANKA-0101800 0.28129376 0.098383830 0.133405404 0.110619710 0.08911439
PBANKA-0101900 0.04066573 0.007504403 0.033564638 0.036681755 0.18011835
PBANKA-0102000 0.07611798 0.066861770 0.012354195 0.077811108 0.09136082
PBANKA-0102200 0.06226485 0.251753986 0.605044037 0.076494250 0.11263897
PBANKA-0102800 0.06418848 0.030164171 0.042579458 0.058224917 0.09315054
PBANKA-0103100 0.11743208 0.010838894 0.026223349 0.027978543 0.07163404
PBANKA-0103800 0.04052086 0.092749297 0.037663604 0.051866920 0.05463320
PBANKA-0104100 0.10698392 0.000000000 0.000000000 0.021640058 0.01323152
PBANKA-0104900 0.15330358 0.040643918 0.047150266 0.096686040 0.08016202
PBANKA-0105100 0.00774803 0.000000000 0.000000000 0.000000000 0.01212389
PBANKA-0105200 0.14295074 0.090669292 0.195180247 0.390381118 0.10203677
PBANKA-0105300 0.07266638 0.035247576 0.083080416 0.026830263 0.08893071
PBANKA-0105500 0.05972304 0.130442176 0.613873789 0.124039039 0.06698287
PBANKA-0105600 0.10822754 0.303080159 0.124325455 0.169514333 0.18211011
PBANKA-0106100 0.08682023 0.000000000 0.037839217 0.049002485 0.05906248
PBANKA-0106300 0.01815899 0.000000000 0.007580943 0.002964005 0.01617340
PBANKA-0106500 0.06709366 0.160165980 0.103227974 0.026045686 0.29711821
PBANKA-0106600 0.06615049 0.041341810 0.011974030 0.090413046 0.08166804
PBANKA-0106900 0.06772833 0.099129853 0.058992669 0.401040855 0.09974455
PBANKA-0106950 0.07031790 0.010838894 0.008323748 0.062299847 0.04044479
PBANKA-0107100 0.17579265 0.059474584 0.032474136 0.119798580 0.10795614
PBANKA-0107300 0.52569711 0.140088244 0.000000000 0.452049173 0.43157921
PBANKA-0107400 0.40027934 0.554030211 0.028538986 0.179514374 0.30285506
PBANKA-0107500 0.06729382 0.074821209 0.023884962 0.041973749 0.10242388
PBANKA-0107600 0.10346987 0.295645683 0.107376132 0.291553393 0.09681055
PBANKA-0108200 0.10862989 0.032331520 0.101822888 0.088844686 0.07713661
PBANKA-0108500 0.02145669 0.030583685 0.024050668 0.031126013 0.02086511
PBANKA-0108600 0.24498462 0.045568380 0.056630631 0.095147870 0.18860540
PBANKA-0108700 0.39883605 0.152992706 0.123346730 0.108083282 0.35455198
PBANKA-0108800 0.08791660 0.022213910 0.000000000 0.016375691 0.04375594
PBANKA-0108900 0.03944358 0.012735216 0.017821829 0.063455344 0.06855602
dim(metadata(sce)$scmap_cell_index$subclusters)
[1]   50 6880
#saveRDS(pb_filtered_sce_orth, file="pb_filtered_sce_orthindex_20181109.rds")

Map to the index

scmapCell_results$wt$cells[, 1:3]
      SC26779_5_100 SC26779_5_101 SC26779_5_102
 [1,]          6554          6685          6163
 [2,]          6247          6528          6200
 [3,]          6734          6436          6466
 [4,]          6323          6588          6124
 [5,]          6468          6470          6871
 [6,]          6685          6323          6563
 [7,]          6470          6606          6456
 [8,]          6436          6614          6149
 [9,]          6515          6608          6494
[10,]          6614          6854          6647
## get cell indexes 
getcells <- scmapCell_results$wt$cells[1, ]
## exctract cells metadata from index
cdsce <- colData(sce)[getcells, ]
## extract similarities
topsim <- scmapCell_results$wt$similarities[1, ]
## get name of top cell matched
mutants.sce$topcell <- rownames(cdsce)
## get sex id
mutants.sce$topcell_ac <- cdsce$label.monocle_sex
## get coordinates of matched cell
mutants.sce$indexPC1 <- cdsce$label.UMAP_1
mutants.sce$indexPC2 <- cdsce$label.UMAP_2
#mutants.sce$pbpt <- cdsce$pseudotime
#mutants.sce$pbbulk <- cdsce$bulk
mutants.sce$topcell_sp <- mutants.sce$topcell_ac
mutants.sce$topsim <- topsim
mutants.sce$topcell_sp[mutants.sce$topsim < 0.4] <- "unassigned"
table(mutants.sce$topcell_sp)

Asexual_Early  Asexual_Late        Female          Male    unassigned 
          335           346           186           548          1302 
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- colData(mutants.sce)[colData(mutants.sce)$label.genetic_background =="PBANKA_820", ]
table(df_820$topcell_sp, df_820$label.fluoresence_sorted_on)
               
                GFP Hoechst mCherry
  Asexual_Early   0      89       2
  Asexual_Late    5     150       3
  Female          2      84      54
  Male          158      47       0
  unassigned     52     549     198
table(colData(mutants.sce)$topcell_sp, colData(mutants.sce)$label.identity_name_updated)
               
                fd1 fd2 fd3 fd4 gd1 md1 md2 md3 md4 md5 wild-type
  Asexual_Early   9  74  13   9  20  12  17  20   8  67        86
  Asexual_Late   14  21  37   8  36  45  21  43  20  22        79
  Female          1   2   7   2   0  17  39  44   3   6        65
  Male           64  51  14  41  25   0   0  34  12 104       203
  unassigned     65  92  63  33 152 181 184 195  34  47       256
#### TOP 3NN method

#This function makes a list of the PC means for each cell and then do.call below rbinds them into a dataframe called big_data

datalist = list()

for (i in colnames(scmapCell_results$wt$cells)) {
  
  getcellstest <- scmapCell_results$wt$cells[1:3, i]
  cdscetest <- colData(sce)[getcellstest, ]
  PC1mean <- mean(cdscetest$label.UMAP_1)
  PC2mean <- mean(cdscetest$label.UMAP_2)
  # ... make some data
  dat <- data.frame(i, PC1mean, PC2mean)
  dat$i <- i  # maybe you want to keep track of which iteration produced it?
  datalist[[i]] <- dat # add it to your list
}

big_data = do.call(rbind, datalist)
# or big_data <- dplyr::bind_rows(datalist)
# or big_data <- data.table::rbindlist(datalist)

test <- big_data[1, ]

df <- data.frame(X=colData(sce)$label.UMAP_1, Y=colData(sce)$label.UMAP_2, row.names = rownames(colData(sce)))

#the snap function snaps to the nearest cell in PC coordiantes
snap <- function(df, test){
  require(Biobase)
  d <- matchpt(as.matrix(df),
               as.matrix(data.frame(X=test$PC1mean,Y=test$PC2mean)))
  
  min_row <- rownames(d[d$distance==min(d$distance),])
  
  test$X_snap <- unique(df[min_row,"X"])
  test$Y_snap <- unique(df[min_row,"Y"])
  test$pb_cell <- min_row
  
  test
}

#this loops through each cell and in big_data and runs the snap function
datalist2 = list()
colnames(big_data) <- c("sample_id", "PC1mean", "PC2mean")
for (i in rownames(big_data)) {
  test <- big_data[i, ]
  coord <- snap(df, test)
  coord$i <- i
  datalist2[[i]] <- coord
}
big_data2 = do.call(rbind, datalist2)


table(rownames(big_data2)==rownames(colData(mutants.sce)))

TRUE 
2717 
allpbcd <- colData(sce)
allpbcd <- as.data.frame(allpbcd)
pbabsclust <- allpbcd[, c("label.pt_id_cols", "label.identity_name_updated", "label.monocle_sex"), drop=FALSE]
pbabsclust$pb_sample_id <- rownames(pbabsclust)

# Now merge the pc cell asignments with their abs clust and get in the right order
big_data3 <- merge(big_data2, pbabsclust, by.x = "pb_cell", by.y = "pb_sample_id", all.x=TRUE, all.y=FALSE)
big_data4 <- big_data3[match(rownames(big_data2), big_data3$sample_id), ]


colors <- c("6"="#78C679",
            "2"="#D1EC9F",
            "0"="#FEB24C",
            "1"="#F4CF63",
            "3"="#FEEEAA",
            "4"="#85B1D3",
            "7"="#9ecae1",
            "5"="#C9E8F1",
            "M"= "#B7B7D8",
            "F"="#9C96C6",
            "unassigned"="black")

ggplot(big_data4, aes(PC1mean, PC2mean)) +
  geom_point(col = big_data4$label.pt_id_cols) +
                     theme_void() +
                     coord_fixed()


#+ geom_point(aes(colour=factor(big_data4$label.label.pt_id_cols))) + scale_color_manual(values = colors) 

                     

ggplot(big_data4, aes(X_snap, Y_snap)) +
  geom_point(col = big_data4$label.pt_id_cols) +
                     theme_void() +
                     coord_fixed() 


#+ geom_point(aes(colour=factor(big_data4$label.label.pt_id_cols))) + scale_color_manual(values = colors) 

##add info to SCE and save colData, to be an assigned cell all 3NN must have a cos sim >0.4
scmapCell_results$wt$similarities[, 1:3]
      SC26779_5_100 SC26779_5_101 SC26779_5_102
 [1,]     0.2654243     0.3283723     0.4455592
 [2,]     0.2657229     0.3322655     0.4476295
 [3,]     0.2649363     0.3346425     0.4463425
 [4,]     0.2685002     0.3314723     0.4455779
 [5,]     0.2655940     0.3347640     0.4463982
 [6,]     0.2654118     0.3351348     0.4472700
 [7,]     0.2659329     0.3289516     0.4482864
 [8,]     0.2683255     0.3303473     0.4451670
 [9,]     0.2668498     0.3284948     0.4503529
[10,]     0.2664087     0.3284307     0.4486978
topsim1 <- scmapCell_results$wt$similarities[1, ]
topsim2 <- scmapCell_results$wt$similarities[2, ]
topsim3 <- scmapCell_results$wt$similarities[3, ]

table(big_data4$sample_id==rownames(colData(mutants.sce)))

TRUE 
2717 
#pfobj$pb_cell <- big_data4$pb_cell
#pfobj$PC1mean <- big_data4$PC1mean
bd4 <- big_data4[, c("pb_cell", "sample_id", "PC1mean", "PC2mean", "X_snap", "Y_snap", "label.pt_id_cols", "label.identity_name_updated", "label.monocle_sex")]

colData(mutants.sce) <- cbind(colData(mutants.sce), bd4)

mutants.sce$topsim1 <- topsim1
mutants.sce$topsim2 <- topsim2
mutants.sce$topsim3 <- topsim3
mutants.sce$stage_pred <- big_data4$label.monocle_sex
mutants.sce$stage_pred[mutants.sce$topsim1 < 0.4 | mutants.sce$topsim2 < 0.4 | mutants.sce$topsim3 < 0.4] <- "unassigned"
table(mutants.sce$stage_pred)

Asexual_Early  Asexual_Late   Bipotential        Female          Male    unassigned 
          325           345             1           182           546          1318 
#write.csv(bd4, "pfcellassignmentswithmean3nn_20181029.csv")
#write.csv(colData(pfobj), "pf3d7100scmapclusts2methodindexn100_20190107.csv")
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- colData(mutants.sce)[colData(mutants.sce)$label.genetic_background =="PBANKA_820", ]
table(df_820$stage_pred, df_820$label.fluoresence_sorted_on)
               
                GFP Hoechst mCherry
  Asexual_Early   1      81       2
  Asexual_Late    3     153       3
  Bipotential     0       1       0
  Female          2      83      52
  Male          156      47       0
  unassigned     55     554     200
table(colData(mutants.sce)$stage_pred, colData(mutants.sce)$label.identity_name_updated)
               
                fd1 fd2 fd3 fd4 gd1 md1 md2 md3 md4 md5 wild-type
  Asexual_Early   7  73  11   9  20  10  15  20   7  68        85
  Asexual_Late   16  22  39   8  34  46  23  41  20  21        75
  Bipotential     0   0   0   0   0   0   0   0   0   0         1
  Female          1   2   7   2   0  16  39  44   3   6        62
  Male           64  51  14  41  25   0   0  33  12 104       202
  unassigned     65  92  63  33 154 183 184 198  35  47       264

make final plot with MCA data below

## Read in MCA data
pheno <- read.csv("../scmap/allpb10x_pheno.csv")
## extract rows needed for plotting
df_plot <- pheno[,c("PC2_3d", "PC3_3d", "absclust3")]
df_plot$experiment <- df_plot$absclust3

## extract rows needed for plotting from mapped object
df_plot_pm <- as.data.frame(colData(pm_ss2_field.sce.orth))
df_plot_pm <- df_plot_pm[ ,c("X_snap", "Y_snap", "label.absclust3")]
df_plot_pm$experiment <- "pm"
colnames(df_plot_pm) <- c("PC2_3d", "PC3_3d", "absclust3", "experiment")

## extract rows needed for plotting from mapped object
df_plot_pf <- as.data.frame(colData(pf_ss2_field.sce.orth))
df_plot_pf <- df_plot_pf[ ,c("X_snap", "Y_snap", "label.absclust3")]
df_plot_pf$experiment <- "pf"
colnames(df_plot_pf) <- c("PC2_3d", "PC3_3d", "absclust3", "experiment")

## bind together
df_plot <- rbind(df_plot, df_plot_pf, df_plot_pm)

colors <- c("6"="#78C679",
            "2"="#D1EC9F",
            "0"="#FEB24C",
            "1"="#F4CF63",
            "3"="#FEEEAA",
            "4"="#85B1D3",
            "7"="#9ecae1",
            "5"="#C9E8F1",
            "M"= "#B7B7D8",
            "F"="#9C96C6",
            "unassigned"="black",
            "pm" = "#dc65a4",
            "pf" = "#24245f")

## add alpha for plotting
df_plot$alpha[!df_plot$experiment == "pm"] <- 0.5
df_plot$alpha[df_plot$experiment == "pm"] <- 1
df_plot$alpha[df_plot$experiment == "pf"] <- 1

##plot
scmap_pca <- ggplot(df_plot, aes(x=PC2_3d, y = PC3_3d,colour=experiment, alpha = alpha)) + 
  geom_point() +
  theme_void() +
  scale_color_manual(values = colors) +
  ## remove alpha scale
  scale_alpha_continuous(guide=FALSE)
  ## draw ring around field samples
  #geom_point(data=df_plot[df_plot$experiment == "pm", ], pch=21, fill=NA, size=2, colour="black", stroke=1) +
  ## make non-field samples more opaque
  #geom_point(data=df_plot[-df_plot$experiment == "pm", ], alpha = 0.2)

## view
scmap_pca_2 <- scmap_pca + 
  guides(colour=guide_legend(override.aes = list(size=4)))

scmap_pca_2

save

#ggsave("../images_to_export/field_samples_scmap.png", plot = scmap_pca_2, device = "png", path = NULL, scale = 1, width = 15, height = 10, units = "cm", dpi = 300, limitsize = TRUE)

Overlap and validation

11. Save and Export

save

Save environment

## This saves everything in the global environment for easy recall later
#save.image(file = "GCSKO_merge.RData")
#load(file = "GCSKO_merge.RData")

Save object(s)

## Save an object to a file
saveRDS(tenx.justwt.integrated.sex, file = "../data_to_export/tenx.justwt.integrated.sex.RDS")
## Restore the object
#readRDS(file = "../data_to_export/tenx.mutant.integrated.sex.RDS")

## save integrated object to file
saveRDS(tenx.justwt.integrated, file = "../data_to_export/tenx.justwt.integrated.RDS") 
## restore the object
#tenx.justwt.integrated <- readRDS("../data_to_export/tenx.justwt.integrated.RDS")

## save monocle object
saveRDS(monocle.object.all, file = "../data_to_export/monocle.object.all.RDS") 
## restore the object
#monocle.object.all <- readRDS("../data_to_export/monocle.object.all.RDS")

## save integrated object to file
#saveRDS(GCSKO_mutants, file = "../data_to_export/GCSKO_mutants.RDS") 
## restore the object
#tenx.justwt.integrated <- readRDS("../data_to_export/tenx.justwt.integrated.RDS")

## extract UMAP coordinates and metadata
## extract metadata
mca_website_df <- tenx.justwt.integrated@meta.data

## add umap - merge on row names and then make sure row names are not dropped or added
mca_website_df <- transform(merge(mca_website_df, tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings, by=0, all=TRUE), row.names=Row.names, Row.names=NULL)

## send this to sunil:
# ggplot(mca_website_df, aes(x = UMAP_1, y = UMAP_2)) +
# geom_point(col = mca_website_df$pt_id_cols) +
# theme_void() +
# coord_fixed()

## export
write.csv(mca_website_df, file = "../data_to_export/mca_website_df.csv")

Clean up

#rm(ss2_wt_cells)
#rm(tenx.justwt.integrated)
#rm(tenx.justwt.list)

final figure construction

Cowplot(plot_grid), patchwork(wrap_plots), and ggpubr can all allow multiple plots to be plotted together.

## A
# umap_id_pt
## B
# marker gene expression
## C
# Mutant gene expression
## D
# Modules

#Figure_A <- grid.arrange(arrangeGrob(QC_composite_plot, QC_mito_violin, QC_mito_graph, QC_by_genotype, mapping_rate_plot), nrow=3), nrow=2, heights=c(10,2))

## cowplot method
## can use this for labels: toupper(letters)[1:10]

## C. Mutant genes
mutant_genes_figure <- plot_grid(
                                 ## marker genes starts
                                 marker_gene_plot_FAMB,
                                 marker_gene_plot_MSP8,
                                 marker_gene_plot_MSP1,
                                 marker_gene_plot_AP2G,
                                 marker_gene_plot_CCP2,
                                 marker_gene_plot_MG1,
                                 ## mutant genes starts  
                                 marker_gene_plot_gd1,
                                 marker_gene_plot_md1,
                                 marker_gene_plot_md2,
                                 marker_gene_plot_md3,
                                 marker_gene_plot_md4, 
                                 marker_gene_plot_md5,
                                 marker_gene_plot_fd1,
                                 marker_gene_plot_fd2,
                                 marker_gene_plot_fd3,
                                 marker_gene_plot_fd4, 
  label_size = 1, 
  nrow = 4)

## labels as uppercase letters:
#c(toupper(letters)[2:17])
# labels as roman numerals:
# c(tolower(as.roman(c(2:17))

Figure_publication <- plot_grid(umap_id_pt + theme(plot.margin = unit(c(0, 0, 0, 0), "cm")), 
                                mutant_genes_figure,
                                ## add empty plot to give spacing
                                ggplot() + theme_void(),
                                labels = c('A', 'B'), 
                                label_size = 12, 
                                ncol = 2, 
                                nrow=2, 
                                rel_heights = c(1, 1, 4), 
                                rel_widths = c(1, 2, 3))

Figure_publication

save

ggsave("../images_to_export/Figure_C.png", plot = Figure_publication, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
layout <- "
AAABBB
CCDDEE
FFGGHH
IIJJKK
"

Figure_publication_A <- (composition_umap_10x + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10))) +
(composition_umap_ss2 + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10))) + 
plot_layout(ncol = 2)

Figure_publication <- Figure_publication_A +
#(UMAP_hoo + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10)) + labs(title="Predicted Asexual Cycle Real Timepoint")) +
#(UMAP_kasia + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10)) + labs(title="Predicted Gametocytogenesis Real Timepoint")) +
  ## marker genes
marker_gene_plot_FAMB +
marker_gene_plot_MSP8 +
marker_gene_plot_MSP1 +
marker_gene_plot_CDPK5 +
marker_gene_plot_AP2G +
marker_gene_plot_CCP2 +
marker_gene_plot_MG1 +
marker_gene_plot_ORC1 +
marker_gene_plot_MCM4 +
  ## mutant genes starts  
#marker_gene_plot_gd1 +
#marker_gene_plot_md1 +
#marker_gene_plot_md2 +
#marker_gene_plot_md3 +
#marker_gene_plot_md4 + 
#marker_gene_plot_md5 +
#marker_gene_plot_fd1 +
#marker_gene_plot_fd2 +
#marker_gene_plot_fd3 +
#marker_gene_plot_fd4 + 
plot_layout(ncol = 6, 
            widths = c(1, 1, 1, 1, 1 ,1),
            heights = c(2, 1, 1, 1),
            design = layout
            )

Figure_publication

save

ggsave("../images_to_export/Figure_S2.png", plot = Figure_publication, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)

Appendix

Session Info

R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
 [1] stats4    parallel  grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] colourvalues_0.3.7          readxl_1.3.1                destiny_3.2.0               circlize_0.4.12            
 [5] gganimate_1.0.7             shiny_1.5.0                 monocle3_0.2.3.0            SingleCellExperiment_1.10.1
 [9] SummarizedExperiment_1.18.2 DelayedArray_0.14.1         matrixStats_0.57.0          GenomicRanges_1.40.0       
[13] GenomeInfoDb_1.24.2         IRanges_2.22.2              S4Vectors_0.26.1            Biobase_2.48.0             
[17] BiocGenerics_0.34.0         Nebulosa_1.2.0              reshape2_1.4.4              Hmisc_4.4-1                
[21] Formula_1.2-4               survival_3.2-7              lattice_0.20-41             gridExtra_2.3              
[25] dplyr_1.0.2                 patchwork_1.0.1             ggplot2bdc_0.3.2            cowplot_1.1.0              
[29] ggpubr_0.4.0                ggplot2_3.3.2               viridis_0.5.1               viridisLite_0.3.0          
[33] Seurat_3.2.2                colorspace_1.4-1           

loaded via a namespace (and not attached):
  [1] ggthemes_4.2.4            coda_0.19-4               tidyr_1.1.2               knitr_1.30               
  [5] irlba_2.3.3               data.table_1.13.2         rpart_4.1-15              RCurl_1.98-1.2           
  [9] generics_0.0.2            callr_3.5.1               leidenbase_0.1.2          usethis_1.6.3            
 [13] RANN_2.6.1                proxy_0.4-24              future_1.19.1             spatstat.data_1.4-3      
 [17] httpuv_1.5.4              assertthat_0.2.1          gifski_0.8.6              xfun_0.18                
 [21] hms_0.5.3                 evaluate_0.14             promises_1.1.1            DEoptimR_1.0-8           
 [25] fansi_0.4.1               progress_1.2.2            DBI_1.1.0                 igraph_1.2.6             
 [29] htmlwidgets_1.5.2         spdep_1.1-5               purrr_0.3.4               ellipsis_0.3.1           
 [33] RSpectra_0.16-0           crosstalk_1.1.0.1         ks_1.12.0                 backports_1.1.10         
 [37] deldir_0.1-29             vctrs_0.3.4               TTR_0.24.2                remotes_2.2.0            
 [41] ROCR_1.0-11               abind_1.4-5               RcppEigen_0.3.3.7.0       withr_2.3.0              
 [45] grr_0.9.5                 robustbase_0.93-7         checkmate_2.0.0           vcd_1.4-8                
 [49] sctransform_0.3.1         xts_0.12.1                prettyunits_1.1.1         mclust_5.4.7             
 [53] goftest_1.2-2             cluster_2.1.0             lazyeval_0.2.2            laeken_0.5.1             
 [57] crayon_1.3.4              units_0.6-7               slam_0.1-47               pkgconfig_2.0.3          
 [61] labeling_0.4.2            tweenr_1.0.1              nlme_3.1-149              pkgload_1.1.0            
 [65] nnet_7.3-14               devtools_2.3.2            rlang_0.4.8               globals_0.13.1           
 [69] lifecycle_0.2.0           miniUI_0.1.1.1            rsvd_1.0.3                cellranger_1.1.0         
 [73] rprojroot_1.3-2           polyclip_1.10-0           RcppHNSW_0.3.0            lmtest_0.9-38            
 [77] Matrix_1.2-18             raster_3.3-13             carData_3.0-4             Matrix.utils_0.9.8       
 [81] boot_1.3-25               zoo_1.8-8                 base64enc_0.1-3           ggridges_0.5.2           
 [85] GlobalOptions_0.1.2       processx_3.4.4            pheatmap_1.0.12           png_0.1-7                
 [89] bitops_1.0-6              KernSmooth_2.23-17        DelayedMatrixStats_1.10.1 classInt_0.4-3           
 [93] shape_1.4.5               stringr_1.4.0             jpeg_0.1-8.1              rstatix_0.6.0            
 [97] ggsignif_0.6.0            scales_1.1.1              memoise_1.1.0             magrittr_2.0.1           
[101] plyr_1.8.6                hexbin_1.28.1             ica_1.0-2                 gdata_2.18.0             
[105] zlibbioc_1.34.0           compiler_4.0.3            RColorBrewer_1.1-2        pcaMethods_1.80.0        
[109] fitdistrplus_1.1-1        cli_2.1.0                 LearnBayes_2.15.1         XVector_0.28.0           
[113] listenv_0.8.0             pbapply_1.4-3             ps_1.4.0                  htmlTable_2.1.0          
[117] ggplot.multistats_1.0.0   MASS_7.3-53               mgcv_1.8-33               tidyselect_1.1.0         
[121] stringi_1.5.3             forcats_0.5.0             yaml_2.2.1                latticeExtra_0.6-29      
[125] ggrepel_0.8.2             pbmcapply_1.5.0           tools_4.0.3               future.apply_1.6.0       
[129] rio_0.5.16                rstudioapi_0.11           foreign_0.8-80            smoother_1.1             
[133] scatterplot3d_0.3-41      farver_2.0.3              Rtsne_0.15                digest_0.6.27            
[137] BiocManager_1.30.10       Rcpp_1.0.6                car_3.0-10                broom_0.7.2              
[141] later_1.1.0.1             RcppAnnoy_0.0.16          httr_1.4.2                sf_0.9-6                 
[145] fs_1.5.0                  tensor_1.5                ranger_0.12.1             reticulate_1.18          
[149] splines_4.0.3             uwot_0.1.8                expm_0.999-5              spatstat.utils_1.17-0    
[153] sp_1.4-4                  spData_0.3.8              plotly_4.9.2.1            sessioninfo_1.1.1        
[157] xtable_1.8-4              jsonlite_1.7.1            spatstat_1.64-1           testthat_2.3.2           
[161] R6_2.5.0                  gmodels_2.18.1            pillar_1.4.6              htmltools_0.5.1.1        
[165] mime_0.9                  glue_1.4.2                fastmap_1.0.1             VIM_6.1.0                
[169] class_7.3-17              codetools_0.2-16          utf8_1.1.4                pkgbuild_1.1.0           
[173] mvtnorm_1.1-1             tibble_3.0.4              curl_4.3                  leiden_0.3.3             
[177] gtools_3.8.2              zip_2.1.1                 openxlsx_4.2.2            limma_3.44.3             
[181] rmarkdown_2.5             desc_1.2.0                munsell_0.5.0             e1071_1.7-4              
[185] GenomeInfoDbData_1.2.3    haven_2.3.1               gtable_0.3.0             

Expression plots

Expression - raw - Viridis

## find a good ring marker, to see if there is a better one than the ones reported
#markers_ring <- FindMarkers(tenx.justwt.integrated, ident.1 = c("4", "5", "16", "11", "7", "3", "9", "0", "22"))
#head(markers_ring)

# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 


marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1319500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(ccp2), "(Female)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
Error in italic(ccp2) : could not find function "italic"

Expression - raw - purple


marker_gene_ramp <- colorRampPalette(c("#D3D3D3", "#1D1564"))(50)

marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1319500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("CCP2 (Female)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0416100", coord.fixed = TRUE, 
                                    #min.cutoff = "q1", 
                                    dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MG1 (Male)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1437500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("AP2G (Commitment)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0831000", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MSP1 (Schizont)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1102200", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MSP8 (Asexual)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1101300", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("SBP1 (Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0722600", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("Fam-b2 (Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_HSP70 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0711900", coord.fixed = TRUE, 
                                      #min.cutoff = "q1", 
                                      dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("(HSP70; Reporter)","\n", "PBANKA_0711900")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)
font family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font databasefont family 'Arial' not found in PostScript font database
marker_gene_plot_all

Expression - data - purple

# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD5
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_17 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1418100", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_2 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0102400", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_19 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0716500", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_20 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1435200", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_13 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0902300", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_10 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0413400", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md5")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_3 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0828000", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("gd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_oom <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1302700", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_29 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1447900", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
marker_gene_plot_21 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1454800", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_17 , marker_gene_plot_2 , marker_gene_plot_19 , marker_gene_plot_20 , marker_gene_plot_13 , marker_gene_plot_10 , marker_gene_plot_3 , marker_gene_plot_oom , marker_gene_plot_29 , marker_gene_plot_21 , ncol = 4)
           
## print
mutant_expression_composite

Density Plots

Density - purple

# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=marker_gene_ramp) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=marker_gene_ramp) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
UMAP_composite_mutant_genes

Density -viridis

# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)) )+ guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
UMAP_composite_mutant_genes

Density - viridis

# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0413400    GCSKO-10_820  MD5

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)) )+ guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
UMAP_composite_mutant_genes

# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0413400    GCSKO-10_820  MD5

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "wkde", slot = 'data')

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_color_continuous_sequential(palette = "Purples 2") + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_color_continuous_sequential(palette = "Purples 2") + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + 
                               coord_fixed() + 
                               theme_void() + 
                               labs(title = paste("md2")) + scale_color_continuous_sequential(palette = "Purples 2") + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
## old colour scale: scale_colour_gradientn(colours=marker_gene_ramp )

UMAP_composite_mutant_genes

---
subtitle: 'Gametocyte Development in <i>Plasmodium berghei</i>'
title: |
  ![](../GCSKO_logo.jpg){width=300px}  
  Merging Smart-seq2 and 10X Datasets - wild-type only
author: "[Andrew Russell](https://ajcrussell.wixsite.com/mysite/about)"
institute: Wellcome Sanger Institute
date: '`r format(Sys.Date(), "%B %d, %Y")`'
output:
  html_notebook:
    theme: cosmo
    toc: yes
    toc_depth: 3
    #toc_float: yes
    df_print: paged
---
***
# 1. Introduction and Aims {.tabset}

We have quality-controlled the 10X data and the SS2 data and now are left with the following objects:

10X 5K data - pb_sex_filtered

10X 30K data - pb_30k_sex_filtered 

SS2 mutant data - ss2_mutants_final

# 2. Read in the data  {.tabset}

### Load/Install the Required Packages

```{r load packages, echo = FALSE}
## CRAN packages

## Pathwork is needed to stich plots together using '+'
if(require("patchwork", quietly = TRUE)){
    print("patchwork is loaded correctly")
} else {
    print("trying to install patchwork")
    install.packages("patchwork")
    if(require(patchwork)){
        print("patchwork installed and loaded")
    } else {
        stop("could not install patchwork")
    }
}

## viridis allows different colours to be added to plots
if(require("viridis", quietly = TRUE)){
    print("viridis is loaded correctly")
} else {
    print("trying to install viridis")
    install.packages("viridis")
    if(require(viridis)){
        print("viridis installed and loaded")
    } else {
        stop("could not install viridis")
    }
}

## Seurat is needed for most of this script
if(require("Seurat", quietly = TRUE)){
    print("Seurat is loaded correctly")
} else {
    print("trying to install Seurat")
    install.packages("Seurat")
    if(require(Seurat)){
        print("Seurat installed and loaded")
    } else {
        stop("could not install Seurat")
    }
}

## cowplot is needed for plots in this script
if(require("cowplot")){
    print("cowplot is loaded correctly")
} else {
    print("trying to install cowplot")
    install.packages("cowplot")
    if(require(cowplot)){
        print("cowplot installed and loaded")
    } else {
        stop("could not install cowplot")
    }
}

## gridExtra is needed for grid graphics to plot multiple plots in the same view
if(require("gridExtra")){
    print("gridExtra is loaded correctly")
} else {
    print("trying to install gridExtra")
    install.packages("gridExtra")
    if(require(gridExtra)){
        print("gridExtra installed and loaded")
    } else {
        stop("could not install gridExtra")
    }
}

## grid is needed for grid.arrange function to change size of title
if(require("grid")){
    print("grid is loaded correctly")
} else {
    print("trying to install grid")
    install.packages("grid")
    if(require(grid)){
        print("grid installed and loaded")
    } else {
        stop("could not install grid")
    }
}

##for doing bulk correlation calculations
if(require("Hmisc")){
    print("Hmisc is loaded correctly")
} else {
    print("trying to install Hmisc")
    install.packages("Hmisc")
    if(require(Hmisc)){
        print("Hmisc installed and loaded")
    } else {
        stop("could not install Hmisc")
    }
}

## reshape2 to melt dataframes for plotting:
if(require("reshape2")){
    print("reshape2 is loaded correctly")
} else {
    print("trying to install reshape2")
    install.packages("reshape2")
    if(require(reshape2)){
        print("reshape2 installed and loaded")
    } else {
        stop("could not install reshape2")
    }
}

## to work with data frames:
if(require("dplyr")){
    print("dplyr is loaded correctly")
} else {
    print("trying to install dplyr")
    install.packages("dplyr")
    if(require(dplyr)){
        print("dplyr installed and loaded")
    } else {
        stop("could not install dplyr")
    }
}

## non-CRAN packages

## to make density plots showing gene expression
if(require("Nebulosa")){
    print("Nebulosa is loaded correctly")
} else {
    print("trying to install Nebulosa")
    devtools::install_github("powellgenomicslab/Nebulosa")
    if(require(Nebulosa)){
        print("Nebulosa installed and loaded")
    } else {
        stop("could not install Nebulosa")
    }
}

## monocle3 to calculate pseudotime:
if(require("monocle3")){
    print("monocle3 is loaded correctly")
} else {
    print("Please install monocle3 (https://cole-trapnell-lab.github.io/monocle3/docs/installation/)")
}

## set the seed for both the mixture models and also for the sample function later on:
set.seed(-92497)
```

### Read in the Data

screen hits
```{r}
## EDIT - change this to the excel table once we have it finalized for the screen
screen_hits <- c("PBANKA-0516300",
"PBANKA-1217700",
"PBANKA-0409100",
"PBANKA-1034300",
"PBANKA-1437500",
"PBANKA-0827500",
"PBANKA-0824300",
"PBANKA-1426900",
"PBANKA-0105300",
"PBANKA-0921100",
"PBANKA-1002400",
"PBANKA-0829400",
"PBANKA-1347200",
"PBANKA-0828000",
"PBANKA-0902300",
"PBANKA-1418100",
"PBANKA-1435200",
"PBANKA-1454800",
"PBANKA-0712300",
"PBANKA-0410500",
"PBANKA-1144800",
"PBANKA-1231600",
"PBANKA-0503200",
"PBANKA-0308900",
"PBANKA-1214700",
"PBANKA-0709900",
"PBANKA-0311900",
"PBANKA-0716500",
"PBANKA-1447900",
"PBANKA-0102200",
"PBANKA-0713500",
"PBANKA-0102400",
"PBANKA-1302700",
"PBANKA-1235900",
"PBANKA-0401100",
"PBANKA-0413400",
"PBANKA-1126900",
"PBANKA-1425900",
"PBANKA-0418300",
"PBANKA-1464600",
"PBANKA-0806000")
```

Read in gene annotations
```{r}
gene_annotations <- read.table("../data/Reference/GenesByTaxon_Summary.csv", header = TRUE, sep = ",", stringsAsFactors = TRUE)
dim(gene_annotations)

## convert _ to -
gene_annotations$Gene.ID <- gsub("_", "-", gene_annotations$Gene.ID)
```

load in datasets
```{r}
## load the 10X dataset
pb_sex_filtered <- readRDS("../data_to_export/pb_sex_filtered.RDS")
## load the SS2 dataset
ss2_mutants_final <- readRDS("../data_to_export/ss2_mutants_final.RDS")

## inspect
paste("10x dataset")
pb_sex_filtered
paste("Smart-seq2 dataset")
ss2_mutants_final
paste("The composition of the Smart-seq2 dataset is:")
table(ss2_mutants_final@meta.data$genotype)
```

# 3. Merging the Smart-seq2 and 10X Data {.tabset}

### Prepare data

```{r integration 10x setup}
## extract 10x data
tenx_5k_counts <- as.matrix(pb_sex_filtered@assays$RNA@counts)
tenx_5k_pheno <- pb_sex_filtered@meta.data

## Create fresh object
tenx_5k_counts_to_integrate <- CreateSeuratObject(counts = tenx_5k_counts, meta.data = tenx_5k_pheno, min.cells = 0, min.features = 0, project = "GCSKO")

## add experiment meta data
tenx_5k_counts_to_integrate@meta.data$experiment <- "tenx_5k"

## inspect
tenx_5k_counts_to_integrate
```

We need to make sure the mutant data is compatible with the 10X data. the 10X data has fewer genes represented so we need to find the intersect of the two before integration.
```{r integration ss2 setup}
## extract SS2 data 
mutant_counts_for_integration <- as.matrix(ss2_mutants_final@assays$RNA@counts)
mutant_pheno_for_integration <- ss2_mutants_final@meta.data

## change counts so the :rRNA and :tRNA are not there:
rownames(mutant_counts_for_integration) <- gsub(":ncRNA", "", gsub(":rRNA", "", gsub(":tRNA", "", rownames(mutant_counts_for_integration))))

## change the gene names so that they are - rather than _:
rownames(mutant_counts_for_integration) <- gsub("_", "-", rownames(mutant_counts_for_integration))

## calculate how many of the genes overlap - 10x does start out with 5098 vs 5245
genes_in_tenx_dataset <- intersect(rownames(tenx_5k_counts), rownames(mutant_counts_for_integration))
## print number of genes that overlap
dim(mutant_counts_for_integration)
## subset the mutant counts to contain only 10x genes
mutant_counts_for_integration <- mutant_counts_for_integration[which(rownames(mutant_counts_for_integration) %in% genes_in_tenx_dataset), ]
## print result of genes that overlap
dim(mutant_counts_for_integration)

## make Seurat object:
GCSKO_mutants <- CreateSeuratObject(counts = mutant_counts_for_integration, meta.data = mutant_pheno_for_integration, min.cells = 0, min.features = 0, project = "GCSKO")

## add experiment meta data
GCSKO_mutants@meta.data$experiment <- "mutants"

## inspect
GCSKO_mutants
```

```{r}
## double check that this is the same number of genes
## subset counts so that only genes represented in the other two objects are there:
length(intersect(rownames(tenx_5k_counts), rownames(mutant_counts_for_integration)))
```

IMPORTANT - this next step is different to GCSKO_merge as it subsets the smart-seq2 data into wild-type only. 
```{r}
## subset wt on ss2 data:
ss2_wt_cells <- rownames(GCSKO_mutants@meta.data[GCSKO_mutants@meta.data$genotype == "WT", ])
GCSKO_mutants_wtonly <- subset(GCSKO_mutants, cells = ss2_wt_cells)
```

create list and normalise:
```{r integration normalise}
## make list
tenx.justwt.list <- list(tenx_5k_counts_to_integrate, GCSKO_mutants_wtonly)

## prepare data
for (i in 1:length(tenx.justwt.list)) {
    tenx.justwt.list[[i]] <- NormalizeData(tenx.justwt.list[[i]], verbose = FALSE)
    tenx.justwt.list[[i]] <- FindVariableFeatures(tenx.justwt.list[[i]], selection.method = "vst", 
        nfeatures = 2000, verbose = FALSE)
    all.genes <- rownames(tenx.justwt.list[[i]])
    tenx.justwt.list[[i]] <- ScaleData(tenx.justwt.list[[i]], features = all.genes)
}
```

### Integrate objects

```{r integration}
## Find anchors
tenx.justwt.anchors <- FindIntegrationAnchors(object.list = tenx.justwt.list, dims = 1:21, verbose = FALSE)

## Integrate data
tenx.justwt.integrated <- IntegrateData(anchorset = tenx.justwt.anchors, dims = 1:21, verbose = FALSE, features.to.integrate = genes_in_tenx_dataset)
```

# 4. Dimensionality reduction {.tabset}

### PCA
```{r, fig.width = 10, fig.height = 5}
## Make the default assay integrated
DefaultAssay(tenx.justwt.integrated) <- "integrated"

## Run the standard workflow for visualization and clustering
tenx.justwt.integrated <- ScaleData(tenx.justwt.integrated, verbose = FALSE)
tenx.justwt.integrated <- RunPCA(tenx.justwt.integrated, npcs = 30, verbose = FALSE)

## inspect PCs
ElbowPlot(tenx.justwt.integrated, ndims = 30, reduction = "pca")
```

### Optimised UMAP
After optimisation, the following UMAP can be calculated:
```{r umap run 2, fig.height = 7, fig.width = 7}
## Run optimised UMAP
tenx.justwt.integrated <- RunUMAP(tenx.justwt.integrated, reduction = "pca", dims = 1:10, n.neighbors = 50, seed.use = 1234, min.dist = 0.4, repulsion.strength = 0.03, local.connectivity = 150)
```

```{r umap visualise 2}
## plot
dp1 <- DimPlot(tenx.justwt.integrated, label = TRUE, repel = FALSE, pt.size = 0.05, dims = c(1,2), split.by = "experiment") + 
  ## fix the axis
  coord_fixed() #+ 
  ## reverse the scale
  #scale_y_reverse()

## view
dp1
```

Make final plots:
```{r, fig.height = 10, fig.width = 10}

## split seurat object up
ob.list <- SplitObject(tenx.justwt.integrated, split.by = "experiment")

## make plots for each object
plot.list <- lapply(X = ob.list, FUN = function(x) {
    DimPlot(x, dims = c(1,2), pt.size = 1) + theme(legend.position="bottom")
})

composition_umap_10x <- plot.list$`tenx_5k` + 
  coord_fixed() +
  theme_void() +
  scale_color_manual(values=c(replicate(45, "#999999"))) +
  labs(title = paste("10x (wild-type)")) +
  theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))

composition_umap_ss2 <- plot.list$mutants +
  coord_fixed() +
  theme_void() +
  scale_color_manual(values=c(replicate(46, "#999999"))) +
  labs(title = paste("Smart-seq2 (wild-type)")) +
  theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))

composition_umap <- composition_umap_10x + composition_umap_ss2 

composition_umap
```

save
```{r}
ggsave("../images_to_export/composition_umap.png", plot = composition_umap, device = "png", path = NULL, scale = 1, width = 20, height = 20, units = "cm", dpi = 300, limitsize = TRUE)
```

```{r}
## make plots
## hoo dataset correlation
UMAP_hoo <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "Prediction.Spearman.", dims = c(1,2)) +
  coord_fixed() + 
  theme_void() +
  labs(title = paste("Hoo Predicted Timepoint")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
  scale_colour_manual(values = inferno(12))  +
  labs(colour = "hour") +
  theme(legend.position = "bottom", legend.title=element_text(size=10))

## ap2g timecourse in this paper correlation
UMAP_kasia <- DimPlot(tenx.justwt.integrated, pt.size = 0.1, group.by = "Prediction.Spearman._Kasia", dims = c(1,2)) +
  coord_fixed() + 
  theme_void() +
  labs(title = paste("AP2G Timecourse Predicted Timepoint")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
  scale_colour_manual(values = inferno(10))  +
  labs(colour = "hour") +
  theme(legend.position = "bottom", legend.title=element_text(size=10))

## combine
umap_bulk <- wrap_plots(UMAP_hoo, UMAP_kasia, ncol = 2)

## print
umap_bulk
```

# 5. Clustering {.tabset} 

### Generate clusters

```{r}
## generate new clusters at low resolution
tenx.justwt.integrated <- FindNeighbors(tenx.justwt.integrated, dims = 1:19, reduction = "pca")
tenx.justwt.integrated <- FindClusters(tenx.justwt.integrated, resolution = 2, random.seed = 42, algorithm = 2)
```

### visualise clusters

```{r}
DimPlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE, repel = FALSE, label = TRUE)
```

Make individual plots highlighting where cells in each cluster fall
```{r, echo = FALSE, message=FALSE}
## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list

## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.justwt.integrated@meta.data$integrated_snn_res.2)))

## for loop
for(i in seq_along(levels(tenx.justwt.integrated@meta.data$integrated_snn_res.2))){
  ## make a list of cells
  list_of_cells <- rownames(tenx.justwt.integrated@meta.data[which(tenx.justwt.integrated@meta.data$integrated_snn_res.2 == levels(tenx.justwt.integrated@meta.data$integrated_snn_res.2)[i]), ])
  umap_plot <- DimPlot(tenx.justwt.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells, dims = c(1,2), reduction = "umap") +
    ## fix coordinates
    coord_fixed() + 
  scale_color_manual(values=c("#D3D3D3", "#1D1564")) + 
  theme_void() + 
  labs(title = paste("cluster", levels(tenx.justwt.integrated@meta.data$integrated_snn_res.2)[i])) + 
  theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
  ## add to the list
  list_UMAPs_by_cluster[[i]] <- umap_plot
}

## check number of clusters
length(list_UMAPs_by_cluster)
```

plot
```{r, fig.height = 10, fig.width = 10}
## this function writes the next bit of code for you
## put it into the console and paste the response
#ploty <- c()
#for(i in seq_along(levels(tenx.justwt.integrated@meta.data$seurat_clusters))){
#  ploty <- paste0(ploty, "list_UMAPs_by_cluster[[", i, "]]", " + ")
#}

## plot
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]] + list_UMAPs_by_cluster[[23]] + list_UMAPs_by_cluster[[24]]
```

### Find Markers

```{r}
DefaultAssay(tenx.justwt.integrated) <- "RNA"
# find markers for every cluster compared to all remaining cells, report only the positive ones
pb.markers <- FindAllMarkers(tenx.justwt.integrated, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
pb.markers %>% group_by(cluster) %>% top_n(n = 5, wt = avg_logFC)
```

```{r, R.options = list(width = 300)}
markers_subset <- pb.markers %>% group_by(cluster) %>% top_n(n = 20, wt = avg_logFC)
markers_subset_annotated <- merge(markers_subset, gene_annotations,  by.x = "gene", by.y = "Gene.ID", all = FALSE)
markers_subset_annotated
View(markers_subset_annotated)
```

### visualise markers

```{r, fig.height = 10, fig.width = 10}
## get the top 10 for each cluster
top10 <- pb.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_logFC)

## scale data is missing - but original dfs have scale.data
# df_scaled <- as.matrix(transform(merge(as.data.frame(tenx.justwt.list[[1]]@assays$RNA@scale.data), as.data.frame(tenx.justwt.list[[2]]@assays$RNA@scale.data), by = 0, all=FALSE), row.names=Row.names, Row.names=NULL))
# tenx.justwt.integrated@assays$RNA@scale.data <- df_scaled

## scale data in RNA slot
all.genes <- rownames(tenx.justwt.integrated)
tenx.justwt.integrated <- ScaleData(tenx.justwt.integrated, vars.to.regress = 'experiment', features = all.genes)

## heatmap
DoHeatmap(tenx.justwt.integrated, features = top10$gene) + NoLegend()
```

# 6. Define Cluster Identities {.tabset} 

### Downsampling
```{r}
# ## check number of cells in each cluster
# df_clusters <- as.data.frame(table(tenx.justwt.integrated@meta.data$integrated_snn_res.2))
# plot(df_clusters)
# 
# ## downsample using appropriate metrics
# tenx.justwt.integrated.downsampled <- subset(tenx.justwt.integrated, downsample = 100)
#   
# ## inspect plot
# FeaturePlot(tenx.justwt.integrated.downsampled, features = "PBANKA-1437500", coord.fixed = TRUE, dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
#   theme_void() + 
#   labs(title = paste("AP2G (Commitment)")) + 
#   theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
#   scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
```

### Marker Genes

useful tools for all plots
```{r}
## define male and female symbol
female_symbol <- intToUtf8(9792)
male_symbol <- intToUtf8(9794)

## define colour pal
marker_gene_ramp <- colorRampPalette(c("#D3D3D3", "#1D1564"))(50)
```

#### Expression - cutoffs - purple

```{r, fig.height = 15, fig.width = 15}

marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1319500", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(ccp2)~(Female))) + 
                          theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=6.0), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, 
                                    features = "PBANKA-0416100", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') +
                        coord_fixed() + 
                        theme_void() + 
                        labs(title = expression(italic(mg1)~(Male))) + 
                        theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                        scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1437500", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(ap2g)~(Commitment))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0831000", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(msp1)~(Schizont))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1102200", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE, 
                                     slot = 'scale.data') + 
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(msp8)~(Asexual))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-1101300", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(SBP1)~(Ring))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0722600", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(fam-b2)~(Ring))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_ORC1 <- FeaturePlot(tenx.justwt.integrated, 
                                     features = "PBANKA-0602000", 
                                     coord.fixed = TRUE, 
                                     min.cutoff = "q05", 
                                     max.cutoff = "q95", 
                                     dims = c(1,2), 
                                     reduction = "umap", 
                                     pt.size = 0.5, 
                                     order = TRUE,
                                     slot = 'scale.data') +
                          coord_fixed() + 
                          theme_void() + 
                          labs(title = expression(italic(orc1)~(DNA~replication))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_CDPK5 <- FeaturePlot(tenx.justwt.integrated, 
                                      features = "PBANKA-1351500", 
                                      coord.fixed = TRUE, 
                                      min.cutoff = "q05", 
                                      max.cutoff = "q95", 
                                      dims = c(1,2), 
                                      reduction = "umap", 
                                      pt.size = 0.5, 
                                      order = TRUE, 
                                      slot = 'scale.data') +
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(cdpk5)~(Schizont))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

marker_gene_plot_MCM4 <- FeaturePlot(tenx.justwt.integrated, 
                                      features = "PBANKA-1415600", 
                                      coord.fixed = TRUE, 
                                      min.cutoff = "q05", 
                                      max.cutoff = "q95", 
                                      dims = c(1,2), 
                                      reduction = "umap", 
                                      pt.size = 0.5, 
                                      order = TRUE, 
                                      slot = 'scale.data') +
                          coord_fixed() +
                          theme_void() + 
                          labs(title = expression(italic(mcm4)~(DNA~replication))) + 
                          theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 5), text=element_text(size=6.0), plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))

# + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)

marker_gene_plot_all
```

### Mutant Genes

#### Expression - with cutoffs

```{r, fig.height = 10, fig.width = 10}
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD4 
# PBANKA-0716500        GCSKO-19  MD5 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD3
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_fd3 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-1418100", 
                                   coord.fixed = TRUE,
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                        theme_void() + 
                        labs(title = expression(italic(fd3))) + 
                        theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                        ## add sex symbols
                        #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                        #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md4 <- FeaturePlot(tenx.justwt.integrated, 
                                  dims = c(1,2), 
                                  reduction = "umap", 
                                  features = "PBANKA-0102400", 
                                  coord.fixed = TRUE, 
                                  min.cutoff = "q05",
                                  max.cutoff = "q95", 
                                  pt.size = 0.5, 
                                  order = TRUE, 
                                  slot = 'scale.data') + 
                      theme_void() + 
                      labs(title = expression(italic(md4))) + 
                      theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                      ## add sex symbols
                      #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                      #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md5 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-0716500", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95",
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md5))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = "")) 
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd4 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-1435200", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd4))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = "")) 
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd2 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-0902300", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd2))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       # + annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md3 <- FeaturePlot(tenx.justwt.integrated, 
                                   dims = c(1,2), 
                                   reduction = "umap", 
                                   features = "PBANKA-0413400", 
                                   coord.fixed = TRUE, 
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95",
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md3))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                      #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_gd1 <- FeaturePlot(tenx.justwt.integrated, 
                                  dims = c(1,2), 
                                  reduction = "umap",
                                  features = "PBANKA-0828000", 
                                  coord.fixed = TRUE, 
                                  min.cutoff = "q05", 
                                  max.cutoff = "q95",
                                  pt.size = 0.5, 
                                  order = TRUE, 
                                  slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(gd1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md1 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-1302700",
                                    coord.fixed = TRUE,
                                    min.cutoff = "q05", 
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_md2 <- FeaturePlot(tenx.justwt.integrated,
                                   dims = c(1,2), 
                                   reduction = "umap",
                                   features = "PBANKA-1447900",
                                   coord.fixed = TRUE,
                                   min.cutoff = "q05", 
                                   max.cutoff = "q95", 
                                   pt.size = 0.5, 
                                   order = TRUE, 
                                   slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(md2))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_fd1 <- FeaturePlot(tenx.justwt.integrated, 
                                    dims = c(1,2), 
                                    reduction = "umap", 
                                    features = "PBANKA-1454800", 
                                    coord.fixed = TRUE, 
                                    min.cutoff = "q05",
                                    max.cutoff = "q95", 
                                    pt.size = 0.5, 
                                    order = TRUE, 
                                    slot = 'scale.data') + 
                       theme_void() + 
                       labs(title = expression(italic(fd1))) + 
                       theme(plot.title = element_text(hjust = 0.5), 
                                legend.text = element_text(size = 5), 
                                text=element_text(size=7), 
                                plot.margin = unit(c(0, 0, 0, 0), "cm")
                                ) + 
                          scale_color_continuous_sequential(palette = "Purples 2", labels = number_format(accuracy = 0.1)) +
                          guides(colour = guide_colourbar(barwidth = 0.3, barheight = 3.0, title = ""))
                       ## add sex symbols
                       #annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
                       #annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))
# scale_color_continuous_sequential(palette = "Purples 2")
# scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_gd1 , marker_gene_plot_md1 , marker_gene_plot_md2 , marker_gene_plot_md3 , marker_gene_plot_md4 , marker_gene_plot_md5 , marker_gene_plot_fd1 , marker_gene_plot_fd2 , marker_gene_plot_fd3 , marker_gene_plot_fd4, ncol = 4)
           
## print
mutant_expression_composite
```

save
```{r}
ggsave("../images_to_export/ALL_CELLS_wt_gene_expression.png", plot = mutant_expression_composite, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
```

```{r, fig.height = 10, fig.width = 10}
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD5
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_17 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1418100", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_2 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0102400", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_19 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0716500", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_20 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1435200", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_13 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0902300", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_10 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0413400", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md5")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_3 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0828000", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("gd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_oom <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1302700", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_29 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1447900", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_21 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1454800", coord.fixed = TRUE, min.cutoff = "q10", max.cutoff = "q90", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_17 , marker_gene_plot_2 , marker_gene_plot_19 , marker_gene_plot_20 , marker_gene_plot_13 , marker_gene_plot_10 , marker_gene_plot_3 , marker_gene_plot_oom , marker_gene_plot_29 , marker_gene_plot_21 , ncol = 4)
           
## print
mutant_expression_composite
```



# 7. Pseudotime on all cells {.tabset}

### Pseudotime calculation

```{r}
## extract data from Seurat
seurat.object.all <- tenx.justwt.integrated
# counts
data <- as(as.matrix(GetAssayData(seurat.object.all, assay = "integrated", slot = "data")), 'sparseMatrix')
# meta data
pd <- data.frame(seurat.object.all@meta.data)

## keep only the columns that are relevant
#pData <- pd %>% select(orig.ident, nCount_RNA, nFeature_RNA)
## add gene short name
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))

## Construct monocle cds
monocle.object.all <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
monocle.object.all = preprocess_cds(monocle.object.all, num_dim = 100, norm_method = "none")
## plot variance explained plot
#plot_pc_variance_explained(monocle.object.all)
## make monocle UMAP
#monocle.object.all = reduce_dimension(monocle.object.all, reduction_method = "UMAP", preprocess_method = "PCA", umap.metric = "euclidean", umap.n_neighbors = 20, umap.min_dist = 0.5, verbose = FALSE)
#plot_cells(monocle.object.all)

## add UMAP from Seurat
monocle.object.all@int_colData@listData$reducedDims@listData[["UMAP"]] <-seurat.object.all@reductions[["umap"]]@cell.embeddings 
plot_cells(monocle.object.all)

## cluster
monocle.object.all = cluster_cells(monocle.object.all)

## plot clusters
plot_cells(monocle.object.all, color_cells_by="partition", group_cells_by="partition",  x = 1, y = 2)

## reduce partitions to 1
monocle.object.all@clusters$UMAP$partitions[monocle.object.all@clusters$UMAP$partitions == "2"] <- "1"

#map pseudotime
monocle.object.all = learn_graph(monocle.object.all, learn_graph_control=list(ncenter=550, minimal_branch_len = 15), use_partition = FALSE)

plot_cells(monocle.object.all, color_cells_by="partition", group_cells_by="partition",  x = 1, y = 2)

## a helper function to identify the root principal points:
## make cluster 2 the root
# get_earliest_principal_node <- function(cds, time_bin="7"){
#   cell_ids <- which(colData(cds)[, "seurat_clusters"] == time_bin)
#   closest_vertex <-
#   cds@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex
#   closest_vertex <- as.matrix(closest_vertex[colnames(cds), ])
#   root_pr_nodes <-
#   igraph::V(principal_graph(cds)[["UMAP"]])$name[as.numeric(names
#   (which.max(table(closest_vertex[cell_ids,]))))]
#   
#   root_pr_nodes
# }

## calculate pseudotime
#monocle.object.all = order_cells(monocle.object.all, root_pr_nodes=get_earliest_principal_node(monocle.object.all))
monocle.object.all = order_cells(monocle.object.all)
## used 5 points at the beginning


## plot
umap_pt <- plot_cells(monocle.object.all, color_cells_by = "pseudotime", label_cell_groups=FALSE, cell_size = 1, x = 1, y = 2, label_branch_points=FALSE, label_leaves=FALSE, label_groups_by_cluster=FALSE, label_roots = FALSE) +
  coord_fixed() +
  theme_void() +
  labs(title = "") +
  theme(plot.title = element_text(hjust = 0.5, size=20), legend.position="bottom", legend.title=element_text (size=20), legend.text=element_text(size=20)) + 
  guides(colour = guide_colourbar(barwidth = 10, barheight = 2, title = "Pseudotime"))

## view plot
umap_pt

## help was obtained from here
## https://github.com/satijalab/seurat/issues/1658
```

save
```{r}
ggsave("../images_to_export/UMAP_pt_wt.png", plot = umap_pt, device = "png", path = NULL, scale = 1, width = 20, height = 20, units = "cm", dpi = 300, limitsize = TRUE)
```

### gganimnate GIF of pseuodtime

```{r}
#install.packages("gganimate")
library(gganimate)
#install.packages("gifski")
#install.packages("av")
#library(gifski)
#library(av)

## make dataframe for plotting
## extract data for GGplot version of this
df_animation <- as.data.frame(monocle.object.all@int_colData@listData$reducedDims@listData[["UMAP"]])
## add pt to this data frame:
pt_values <- as.data.frame(pseudotime(monocle.object.all, reduction_method = "UMAP"))
df_animation <- merge(df_animation, pt_values, by="row.names") 
rownames(df_animation) <- df_animation$Row.names
colnames(df_animation)[4] <- "pt"

## make the static plot
p <- ggplot(df_animation, aes(x = UMAP_1, y = UMAP_2, colour = pt)) +
  geom_point() +
  scale_colour_viridis_c(option = "plasma") +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none")
## view plot
plot(p)

## make animated plot
## make a category for animation
#df_animation$group <- cut(df_animation$pt, 15)

anim <- p +
  transition_time(pt) +
  shadow_mark()

animate(anim, height = 3, width = 3, units = "in", res = 150, bg = 'transparent')

## to change the resolution - https://stackoverflow.com/questions/49058567/define-size-for-gif-created-by-gganimate-change-dimension-resolution 
```
Save animation
```{r}
anim_save("animated_UMAP_transparent_bg_wt.gif", path = "../images_to_export/")
```

```{r}
## extract pt values
pt_values <- as.data.frame(pseudotime(monocle.object.all, reduction_method = "UMAP"))

tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, pt_values, "old_pt_values")
```

#### Define cell identities with branches

Define identities of cells 

male
```{r}
monocle.object_male <- choose_graph_segments(monocle.object.all)
```
female
```{r}
monocle.object_female <- choose_graph_segments(monocle.object.all)
```
bipotential
```{r}
monocle.object_bipot <- choose_graph_segments(monocle.object.all)
```
asexual (pre-branch)
```{r}
monocle.object_asex_pre <- choose_graph_segments(monocle.object.all)
```
asexual fate
```{r}
monocle.object_asex_fate <- choose_graph_segments(monocle.object.all)
```
check
```{r}
df_freq <- data.frame(table(c(colnames(monocle.object_male), colnames(monocle.object_female), colnames(monocle.object_bipot), colnames(monocle.object_asex_pre), colnames(monocle.object_asex_fate))))
paste("number of cells in seurat object is", length(colnames(monocle.object.all)), ". The number of cells selected here with an identitity is", dim(df_freq)[1])
df_freq <- df_freq[df_freq$Freq > 1, ]
df_freq
```
Inspect where these missing cells are:
```{r}
# '%ni%' <- Negate('%in%')
# 
# not_assigned_cells <- colnames(monocle.object.all)[colnames(monocle.object.all) %ni% c(colnames(monocle.object_male), colnames(monocle.object_female), colnames(monocle.object_bipot), colnames(monocle.object_asex), colnames(monocle.object_asex_fate))]
# 
# DimPlot(seurat.object, repel = TRUE, label.size = 5, pt.size = 0.5, cells.highlight = not_assigned_cells, dims = c(2,1), reduction = "DIM_UMAP") +
#   coord_fixed() + 
#   scale_color_manual(values=c("#000000", "#f54e1e"))
```

```{r}
## create annotation dataframe from these results:
df_monocle_sexes <- rbind(data.frame("cell_name" = colnames(monocle.object_male), "sex" = rep("Male", length(colnames(monocle.object_male)))),
                          data.frame("cell_name" = colnames(monocle.object_female), "sex" = rep("Female", length(colnames(monocle.object_female)))),
                          data.frame("cell_name" = colnames(monocle.object_bipot), "sex" = rep("Bipotential", length(colnames(monocle.object_bipot)))),
                          data.frame("cell_name" = colnames(monocle.object_asex_pre), "sex" = rep("Asexual_Early", length(colnames(monocle.object_asex_pre)))),
                          data.frame("cell_name" = colnames(monocle.object_asex_fate), "sex" = rep("Asexual_Late", length(colnames(monocle.object_asex_fate))))
                          #data.frame("cell_name" = not_assigned_cells, "sex" = rep("Unassigned", length(not_assigned_cells)))
                          )

dim(df_monocle_sexes)

## order like the metadata
df_monocle_sexes <- df_monocle_sexes[match(rownames(monocle.object.all@colData), df_monocle_sexes$cell_name), ]

## add this back into the monocle object
monocle.object.all@colData$Sexes_monocle <- df_monocle_sexes$sex

## add this to the seurat object
rownames(df_monocle_sexes) <- df_monocle_sexes$cell_name
df_monocle_sexes_to_add_to_seurat <- df_monocle_sexes[,c("sex"), drop = FALSE]
tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, df_monocle_sexes_to_add_to_seurat, col.name = "monocle_sex")
```

pt vs. real time
```{r}
library(ggridges)

df_plotting <- as.data.frame(tenx.justwt.integrated@meta.data)

pt_real_plot <- ggplot(df_plotting, aes(x = old_pt_values, y =  as.factor(Prediction.Spearman.), fill = as.factor(Prediction.Spearman.))) +
  geom_density_ridges(alpha = .1,
                      rel_min_height = 0.03,  
                               quantile_lines = TRUE, 
                               quantiles = 2,
                      scale = 2) + # plot rigdelines without points
  # plot points with invisible ridgelines
  geom_density_ridges_gradient(point_alpha = 1, 
                               alpha = 0,
                               colour = NA,
                               na.rm = FALSE, 
                               jittered_points = TRUE,
                               aes(point_color = monocle_sex)
                               )  +
  theme_classic() +
  scale_y_discrete(drop=FALSE) +
  labs(title = '', x = "Pseudotime", y = "Predicted Real-time (hpi)") +
  #scale_fill_viridis_c(name = "pseudotime", option = "C", alpha = 1) +
  theme(legend.position = "bottom",
        panel.grid.major.y = element_line(size = (0.2), colour="grey")
        )

pt_real_plot
```

```{r}
gg <- ggplot(df_plotting, aes(as.factor(Prediction.Spearman.), fill = monocle_sex)) + geom_bar(stat = "count", position="fill")

gg
```


# 8. Plots {.tabset}

make composite pseudotime/ID figure

```{r}
# 1 = blue - "#0052c5"
# 2 = red - "#a52b1e"
# 3 = green - "#016c00"
# 4 = yellow - "#ffe400"
#pal_sex <- c("#0052c5","#ffe400", "#a52b1e", "#016c00")

## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex")]

## inspect possible values
list_of_sexes <- names(table(df_pt_id$monocle_sex))

## make a new column
df_pt_id$colour <- NA

## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))

## re-classify the cells that are unassigned cells removed from sexual branch above:
#df_pt_id[which(rownames(df_pt_id) %in% remove_cells), ]$monocle_sex <- "Asexual"

## assign values to each cluster
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values 
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- asex_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))]

df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- male_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))]

df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- female_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))]

df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- bipot_ramp(100)[as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))]

## check everything has a value
#table(is.na(df_pt_id$colour))

## make into a df
#df_pt_id <- df_pt_id[ ,"colour", drop = FALSE]

## add back to seurat object
df <- df_pt_id[ ,"colour", drop = FALSE]
tenx.justwt.integrated <- AddMetaData(tenx.justwt.integrated, df, "pt_id_cols")
rm(df)

## plot
## extract UMAP coords
df_umap_plot <- tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings
df_umap_plot <- merge(df_umap_plot, df_pt_id, by=0, all=TRUE)

## add tree
##The tree for monocle is located here:
# monocle.object@principal_graph_aux[["UMAP"]]$dp_mst 
ica_space_df <- t(monocle.object.all@principal_graph_aux[["UMAP"]]$dp_mst) %>%
      as.data.frame() %>%
      dplyr::select_(prin_graph_dim_1 = "UMAP_1", prin_graph_dim_2 = "UMAP_2") %>%
      dplyr::mutate(sample_name = rownames(.),
                    sample_state = rownames(.))

dp_mst <- monocle.object.all@principal_graph[["UMAP"]]

edge_df <- dp_mst %>%
      igraph::as_data_frame() %>%
      dplyr::select_(source = "from", target = "to") %>%
      dplyr::left_join(ica_space_df %>%
                         dplyr::select_(
                           source="sample_name",
                           source_prin_graph_dim_1="prin_graph_dim_1",
                           source_prin_graph_dim_2="prin_graph_dim_2"),
                       by = "source") %>%
      dplyr::left_join(ica_space_df %>%
                         dplyr::select_(
                           target="sample_name",
                           target_prin_graph_dim_1="prin_graph_dim_1",
                           target_prin_graph_dim_2="prin_graph_dim_2"),
                       by = "target")

## make ggplot
umap_id_pt <- ggplot(df_umap_plot, aes(x = UMAP_1, y = UMAP_2)) + 
                     geom_point(col = df_umap_plot$colour) +
                     theme_void() +
                     coord_fixed() +
                     geom_segment(aes_string(x="source_prin_graph_dim_1",
                                     y="source_prin_graph_dim_2",
                                     xend="target_prin_graph_dim_1",
                                     yend="target_prin_graph_dim_2"),
                          data=edge_df)

umap_id_pt
```

save
```{r}
#ggsave("../images_to_export/umap_id_pt.png", plot = umap_id_pt, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
```
cluster plot
```{r}
### Prepare data

## extract df with cluster ID, 
df_pt_id <- tenx.justwt.integrated@meta.data[ ,"seurat_clusters", drop=FALSE]
## extract UMAP coords
df_plot <- tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings
df_plot <- merge(df_plot, df_pt_id, by=0, all=TRUE)
## add colour col
df_plot$colour <- df_plot$seurat_clusters
levels(df_plot$colour) <- list(print(asex_ramp(17)[1]) = "9", asex_ramp(17)[2]="4", 
                               asex_ramp(17)[3]="15", 
                                asex_ramp(17)[4] = "8", 
                                asex_ramp(17)[5]= "1",
                                asex_ramp(17)[6]=  "14", 
                                asex_ramp(17)[7] = "2", 
                                asex_ramp(17)[8] = "10",
                                asex_ramp(17)[9] = "3",
                                asex_ramp(17)[10] = "0",
                                asex_ramp(17)[11] = "6",
                                asex_ramp(17)[12] = "5",
                                asex_ramp(17)[13] = "7",
                                asex_ramp(17)[14] = "12",
                                asex_ramp(17)[15] = "18",
                                asex_ramp(17)[16] = "20",
                                asex_ramp(17)[17] = "23",
                                bipot_ramp(3)[2] = "11",
                                male_ramp(3)[2] = "16",
                                male_ramp(3)[3] = "13",
                                female_ramp(4)[2] = "21",
                                female_ramp(4)[3] = "22",
                                female_ramp(4)[4] = "19",
                                female_ramp(4)[4] = "17"
                                      )
## aabbreviate clusters
df_plot$cluster_names <- df_plot$seurat_clusters
levels(df_plot$cluster_names) <- list(A_1="9", 
                                      A_2="4", 
                                      A_3="15", 
                                      A_4 = "8", 
                                      A_5 = "1",
                                      A_6=  "14", 
                                      A_7 = "2", 
                                      A_8 = "10",
                                      A_9 = "3",
                                      A_10 = "0",
                                      A_11 = "6",
                                      A_12 = "5",
                                      A_13 = "7",
                                      A_14 = "12",
                                      A_15 = "18",
                                      A_16 = "20",
                                      A_17 = "23",
                                      B = "11",
                                      M_1 = "16",
                                      M_2 = "13",
                                      F_1 = "21",
                                      F_2 = "22",
                                      F_3 = "19",
                                      F_3 = "17"
                                      )

library(dplyr)
dat%>%
group_by(custid)%>% 
summarise(Mean=mean(value), Max=max(value), Min=min(value), Median=median(value), Std=sd(value))



## plot
umap_id <- ggplot(df_plot, aes(x = UMAP_1, y = UMAP_2)) + 
                     geom_point(col = df_plot$colour) +
                     theme_void() +
                     coord_fixed()

umap_id
```

```{r}
## these get defined later on, but are replicated above here for plotting
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## make a new column for plotting
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- tenx.justwt.integrated@meta.data$seurat_clusters

## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asexual_early_clusters,asexual_late_clusters, bipotential_clusters, male_clusters, female_clusters)
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- factor(x = tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, levels = my_levels)

## change the levels
levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting) <- list(A_1="9", 
                                      A_2="4", 
                                      A_3="15", 
                                      A_4 = "8", 
                                      A_5 = "1",
                                      A_6=  "14", 
                                      A_7 = "2", 
                                      A_8 = "10",
                                      A_9 = "3",
                                      A_10 = "0",
                                      A_11 = "6",
                                      A_12 = "5",
                                      A_13 = "7",
                                      A_14 = "12",
                                      A_15 = "18",
                                      A_16 = "20",
                                      A_17 = "23",
                                      P = "11",
                                      M_1 = "16",
                                      M_2 = "13",
                                      F_1 = "21",
                                      F_2 = "22",
                                      F_3 = "19",
                                      F_3 = "17"
                                      )

### Annotation set-up

## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex", "seurat_clusters_dot_plotting", "Prediction.Spearman.", "Prediction.Spearman._Kasia")]

## make a new column
df_pt_id$colour <- NA

## assign bins to each of the values
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values 
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))

## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))

## assign values to each cluster
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "colour"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
## BECAUSE we have ordered the clusters already, we can simply take the row index for this bit
df_annotation$colour <- NA
df_annotation[1:17, ]$colour <- asex_ramp(100)[df_annotation[1:17, ]$x]
df_annotation[18, ]$colour <- bipot_ramp(100)[df_annotation[18, ]$x]
df_annotation[19:20, ]$colour <- male_ramp(100)[df_annotation[19:20, ]$x]
df_annotation[21:23, ]$colour <- female_ramp(100)[df_annotation[21:23, ]$x]

## make colour pallete
#pal_plot <- c(asex_ramp(170)[1,11,21,31,], bipot_ramp(3)[2], male_ramp(5)[3:4], female_ramp(5)[2:5])
pal_plot <- c(df_annotation$colour, '#49C16DFF')

## plot
umap_with_clusters <- DimPlot(tenx.justwt.integrated,
        group.by = "seurat_clusters_dot_plotting",
        dims = c(1,2), 
        reduction = "umap", 
        pt.size = 1,
        label.box = TRUE,
        label.size = 8,
        label.color	= c(rep(c("#000000"), 12), rep(c("#ffffff"), 5), rep(c("#000000"), 2),"#ffffff", "#000000", "#ffffff", "#ffffff"),
        order = TRUE, 
        repel = TRUE, 
        label = TRUE,
        cols = pal_plot) +
  scale_color_manual(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8",  "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3"), values = pal_plot) +
  theme_void() +
                     coord_fixed()

umap_with_clusters
```

save
```{r}
ggsave("../images_to_export/umap_with_clusters.png", plot = umap_with_clusters, device = "png", path = NULL, scale = 1, width = 21, height = 29.5, units = "cm", dpi = 300, limitsize = TRUE)
```

### dotplot

A dotplot allows us to look at the expression of multiple genes in a clearer way than succesive UMAP plots

```{r, fig.width = 9, fig.height= 8}
### Data set-up

# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0413400    GCSKO-10_820  MD3
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0102400         GCSKO-2  MD5 

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

# PBANKA-1437500 - AP2G - commitment
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)

marker_genes_list <- c("PBANKA-1437500", "PBANKA-1319500", "PBANKA-0416100", "PBANKA-0831000", "PBANKA-1102200")
mutant_genes_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0413400", "PBANKA-0716500", "PBANKA-0102400",  "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

## these get defined later on, but are replicated above here for plotting
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## copy the clusters so you don't permanently edit the master
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- tenx.justwt.integrated@meta.data$seurat_clusters

## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asexual_early_clusters,asexual_late_clusters, bipotential_clusters, male_clusters, female_clusters)

## reorder the levels
tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting <- factor(x = tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting, levels = my_levels)

## rename clusters so that they are intuitive
# this trick is from here: http://www.cookbook-r.com/Manipulating_data/Renaming_levels_of_a_factor/
levels(tenx.justwt.integrated@meta.data$seurat_clusters_dot_plotting) <- list(Asexual_1="9", 
                                                                              Asexual_2="4", 
                                                                              Asexual_3="15", 
                                                                              Asexual_4 = "8", 
                                                                              Asexual_5 = "1",
                                                                              Asexual_6=  "14", 
                                                                              Asexual_7 = "2", 
                                                                              Asexual_8 = "10",
                                                                              Asexual_9 = "3",
                                                                              Asexual_10 = "0",
                                                                              Asexual_11 = "6",
                                                                              Asexual_12 = "5",
                                                                              Asexual_13 = "7",
                                                                              Asexual_14 = "12",
                                                                              Asexual_15 = "18",
                                                                              Asexual_16 = "20",
                                                                              Asexual_17 = "23",
                                                                              Bipotential = "11",
                                                                              Male_1 = "16",
                                                                              Male_2 = "13",
                                                                              Female_1 = "21",
                                                                              Female_2 = "22",
                                                                              Female_3 = "19",
                                                                              Female_3 = "17"
                                                                              )

### Annotation set-up

## extract pseudotime numbers and identity of cells to a dataframe
df_pt_id <- tenx.justwt.integrated@meta.data[,c("old_pt_values", "monocle_sex", "seurat_clusters_dot_plotting", "Prediction.Spearman.", "Prediction.Spearman._Kasia")]

## make a new column
df_pt_id$colour <- NA

## assign bins to each of the values
## help here: https://stackoverflow.com/questions/9946630/colour-points-in-a-plot-differently-depending-on-a-vector-of-values 
df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Asexual_Early" | df_pt_id$monocle_sex == "Asexual_Late", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Male", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Male", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Female", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Female", ]$old_pt_values,breaks = 100))

df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$colour <- as.numeric(cut(df_pt_id[df_pt_id$monocle_sex == "Bipotential", ]$old_pt_values,breaks = 100))

## make colour ramps
asex_ramp <- colorRampPalette(c("#D5E3F5", "#0052c5"))
male_ramp <- colorRampPalette(c("white", "yellow", "#016c00"))
female_ramp <- colorRampPalette(c("yellow", "#a52b1e"))
bipot_ramp <- colorRampPalette(c("white", "#ffe400"))

## assign values to each cluster
## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "colour"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
## BECAUSE we have ordered the clusters already, we can simply take the row index for this bit
df_annotation$colour <- NA
df_annotation[1:17, ]$colour <- asex_ramp(100)[df_annotation[1:17, ]$x]
df_annotation[18, ]$colour <- bipot_ramp(100)[df_annotation[18, ]$x]
df_annotation[19:20, ]$colour <- male_ramp(100)[df_annotation[19:20, ]$x]
df_annotation[21:23, ]$colour <- female_ramp(100)[df_annotation[21:23, ]$x]

## plot annotation 
## this really helped: https://www.biostars.org/p/396810/
h2 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     #legend <- plot_grid(get_legend(h2), get_legend(h1), ncol = 1)
 h2 <- h2 + theme(legend.position = "none")
 #   geom_point(col = df_umap_plot$colour)
 
 h1 <- ggplot(df_annotation)+
  geom_point(mapping = aes(x = Group.1, y = 1),
            col = df_annotation$colour,
            size = 5)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     #legend <- plot_grid(get_legend(h2), get_legend(h1), ncol = 1)
 h1 <- h1 + theme(legend.position = "none")
 #   geom_point(col = df_umap_plot$colour)

 ## add predicted time point
 ## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "Prediction.Spearman."], list(df_pt_id$seurat_clusters_dot_plotting), mean)
df_annotation$colour <- NA
df_annotation$colour <- viridis(300)[(df_annotation$x)*10]

h3 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))#+
     #facet_grid(.~colour, scales = "free_x")
     legend_h3 <- get_legend(h3)
 h3 <- h3 + theme(legend.position = "none")

  h3 <- ggplot(data = df_annotation, 
              mapping = aes(x = Group.1, fill=x)
              ) +
       geom_bar(col = "#FFFFFF", width = 1)+
       theme_void() +
       theme(panel.spacing.x = unit(1, "mm")) +
       scale_fill_viridis(option = 'cividis')
 legend_h3 <- get_legend(h3)
 h3 <- h3 + theme(legend.position = "none")
 
 ## add kasia data
  ## take the mean of the bin
df_annotation <- aggregate(df_pt_id[, "Prediction.Spearman._Kasia"], list(df_pt_id$seurat_clusters_dot_plotting), mean)
df_annotation$colour <- NA
df_annotation$colour <- viridis(300)[(df_annotation$x)*10]

h4 <- ggplot(df_annotation)+
  geom_bar(mapping = aes(x = Group.1, y = 1), 
           stat = "identity",
           fill = df_annotation$colour,
           col = "#FFFFFF",
           width = 1)+
      theme_void()+
      theme(panel.spacing.x = unit(1, "mm"))+
     facet_grid(.~colour, scales = "free_x")
     legend <- plot_grid(get_legend(h4), ncol = 1)
    legend_h4 <- get_legend(h4)
 h4 <- h4 + theme(legend.position = "none")

 h4 <- ggplot(data = df_annotation, 
              mapping = aes(x = Group.1, fill=x)
              ) +
       geom_bar(col = "#FFFFFF", width = 1)+
       theme_void() +
       theme(panel.spacing.x = unit(1, "mm")) +
       scale_fill_viridis()
 legend_h4 <- get_legend(h4)
 h4 <- h4 + theme(legend.position = "none")

### Plot
dot_plot_markers <- DotPlot(tenx.justwt.integrated, 
                            features = c(marker_genes_list, rev(mutant_genes_list)), 
                            group.by = "seurat_clusters_dot_plotting", 
                            dot.min = 0.00001,
                            assay = 'RNA') +
  theme_classic() +
  coord_fixed() +
  coord_flip() +
  # change appearance and remove axis elements, and make room for arrows
  theme(axis.text.x = element_text(size=16, angle = 45, hjust=1,vjust=1, family = "Arial"), 
        axis.text.y = element_text(size=16, face="italic"), 
        text=element_text(size=16, family="Arial", colour="black"), 
        legend.position = "bottom", 
        legend.direction = "horizontal", 
        legend.box = "vertical", 
        plot.title = element_blank(), 
        plot.margin = unit(c(1,3,1,3), "lines"), 
        axis.line = element_blank(),
        panel.border = element_rect(colour = "black", fill=NA, size=0.5)) +
  #change the colours
  #scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white", begin = 0, end = 1, direction = 1) +
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## change x axis label
  # labs(x = "Marker Genes", y = "Cluster", title = "Expression of Marker Genes by Cluster") 
  labs(x = "", y = "", title = "") +
  ## add arrows
  #annotate("segment", x = 5.5, xend = 5.5, y = 21.5, yend = 25, colour = "green", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  #annotate("segment", x = 5.5, xend = 5.5, y = 16.5, yend = 21.5, colour = "red", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  #annotate("segment", x = 5.5, xend = 5.5, y = 0, yend = 15.5, colour = "grey", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
  ## annotate asex
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## annotate bipotential
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters, bipotential_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## annotate sexes
  geom_hline(aes(yintercept = (length(c(asexual_early_clusters, asexual_late_clusters, bipotential_clusters, male_clusters))+0.5)), size = 0.5, linetype= 'dashed') +
  ## change label on bottom of plot so we can indicate markers
  scale_x_discrete(labels = rev(c("gd1", "md1", "md2", "md3", "md4", "md5", "fd1", "fd2", "fd3", "fd4", "msp8", "msp1", "mg1", "ccp2", "ap2g"))) +
  ## change label on bottom of plot so we can indicate markers
  scale_y_discrete(labels = c("Asexual 1", "Asexual 2" , "Asexual 3", "Asexual 4", "Asexual 5", "Asexual 6", "Asexual 7", "Asexual 8",  "Asexual 9", "Asexual 10", "Asexual 11", "Asexual 12", "Asexual 13", "Asexual 14", "Asexual 15", "Asexual 16", "Asexual 17", "Progenitor", "Male 1", "Male 2", "Female 1", "Female 2", "Female 3")) +
  ## change name of legends
  guides(col=guide_colorbar(title = 'Scaled Average Expression'),
         size=guide_legend("% of cells expressing"))

## view
#print(dot_plot_markers)

plot <- plot_grid(h4, h3, h1, dot_plot_markers, align = "v", ncol = 1, axis = "tb", rel_heights = c(0.5, 0.5, 0.5, 18)) 

dot_plot <- plot_grid(plot, legend_h3, legend_h4, nrow = 1, rel_widths = c(10, 1.5, 1.5))

dot_plot
```

save
```{r}
ggsave("../images_to_export/dot_plot_all.png", plot = dot_plot, device = "png", path = NULL, scale = 1, width = 29.7, height = 21, units = "cm", dpi = 300, limitsize = TRUE)
```

# 9. Subset sexual cells {.tabset}

Make a subsetted Seurat object of sexual cells. 

Include the pre-branch too as well as any weird clusters that may have clustered out. 

it's been a while since we looked at the clusters so let's check them out again:
```{r, fig.height = 10, fig.width = 10}
## Plot
DimPlot(tenx.justwt.integrated, label = TRUE, repel = FALSE, pt.size = 0.05, group.by = "seurat_clusters", dims = c(1,2), reduction = "umap") + coord_fixed()

## plot
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]] + list_UMAPs_by_cluster[[23]] + list_UMAPs_by_cluster[[24]]
```

### Define cells and subset
```{r}
asexual_early_clusters <- c(9, 4, 15, 8, 1, 14, 2, 10, 3, 0, 6, 5)
asexual_late_clusters <- c(7, 12, 18, 20, 23)
bipotentional_early_clusters <- # 0?
bipotential_clusters <- c(11) 
male_clusters <- c(16, 13)
female_clusters <- c(21, 22, 17, 19)

## define cells
cell_names_subset_monocle_ids <- rownames(tenx.justwt.integrated@meta.data[tenx.justwt.integrated@meta.data$monocle_sex %in% c("Asexual_Early", "Bipotential", "Male", "Female"), ])

## 3, 0, 6, 5 are early clusters of asexuals before the branch
cell_names_subset_cluster_ids <- rownames(tenx.justwt.integrated@meta.data[tenx.justwt.integrated@meta.data$seurat_clusters %in% c(male_clusters, female_clusters, bipotential_clusters, 3, 0, 6, 5), ])

cell_names_subset_intersect <- intersect(cell_names_subset_monocle_ids, cell_names_subset_cluster_ids)

## subset cells into new object
tenx.justwt.integrated.sex <- subset(tenx.justwt.integrated, cells = cell_names_subset_intersect)
```

### inspect/check
```{r}
## inspect object
tenx.justwt.integrated.sex

## look at original UMAP
DimPlot(tenx.justwt.integrated.sex, label = TRUE, repel = TRUE, pt.size = 0.1, split.by = "experiment", dims = c(1,2), reduction = "umap") + coord_fixed()
```

### Remove contaminant asexual cells

we want to remove:
```{r}
## look at original UMAP
plot_sexual_subsetting <- DimPlot(tenx.justwt.integrated.sex, label = TRUE, repel = TRUE, pt.size = 0.1, dims = c(1,2), reduction = "umap") + 
  coord_fixed() + 
  geom_vline(aes(xintercept = -1, alpha = 5))

plot_sexual_subsetting
```

```{r}
## extract cell embeddings
df_sex_cell_embeddings <- as.data.frame(tenx.justwt.integrated.sex@reductions[["umap"]]@cell.embeddings)

## subset anything lower than -0.8 in UMAP 2 and -0.1 in UMAP 1
remove_cells <- row.names(df_sex_cell_embeddings[which(df_sex_cell_embeddings$UMAP_1 < -1), ])

## plot these cells
DimPlot(tenx.justwt.integrated.sex, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = remove_cells, dims = c(1,2), reduction = "umap") + 
  coord_fixed() + 
  scale_color_manual(values=c("#000000", "#f54e1e")) + 
  theme_void() + 
  labs(title = paste("cells highlighted will be removed")) + 
  theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
```

## Final Subset
```{r}
## make keep cells from the remove_cells
## make the not in function
'%ni%' <- Negate('%in%')
keep_cells <- colnames(tenx.justwt.integrated.sex)[which(colnames(tenx.justwt.integrated.sex) %ni% remove_cells)]

## subset
tenx.justwt.integrated.sex <- subset(tenx.justwt.integrated.sex, cells = keep_cells)

## inspect
tenx.justwt.integrated.sex
```

copy old clusters over
```{r}
## copy old clusters
tenx.justwt.integrated.sex <- AddMetaData(tenx.justwt.integrated.sex, tenx.justwt.integrated.sex@meta.data$seurat_clusters, col.name = "post_integration_clusters")
```

# 10. Sex assignment of mutants 

## A. Seurat Method

```{r}
## find transfer anchors
DefaultAssay(tenx.justwt.integrated) <- "integrated"
merge.anchors <- FindTransferAnchors(reference = tenx.justwt.integrated, query = GCSKO_mutants, 
    dims = 1:30)

## transfer data between ref and query
predictions <- TransferData(anchorset = merge.anchors, refdata = tenx.justwt.integrated$seurat_clusters_dot_plotting, 
    dims = 1:30)

## add meta data to object
GCSKO_mutants <- AddMetaData(GCSKO_mutants, metadata = predictions)

## new object from this
mutant_seurat <- GCSKO_mutants

## look at breakdown of mutant by designation
df <- as.data.frame(mutant_seurat@meta.data)
table(df$predicted.id, df$identity_name_updated)
```

```{r}
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- df[df$genetic_background =="PBANKA_820", ]
table(df_820$predicted.id, df_820$fluoresence_sorted_on)
```

inspect gd1
```{r}
df[which(df$predicted.id == "Female_1" & df$identity_name_updated == "gd1"), ][, 118:146]
```


Calculate sex ratios
```{r}
## use designations above for sexes
male_cells <- rownames(mutant_seurat@meta.data[mutant_seurat@meta.data$predicted.id == "Male", ])
female_cells <- rownames(mutant_seurat@meta.data[mutant_seurat@meta.data$predicted.id == "Female", ])
ss2_mutants_final_male <- subset(mutant_seurat, cells = male_cells)
ss2_mutants_final_female <- subset(mutant_seurat, cells = female_cells)

## inspect
ss2_mutants_final_male
ss2_mutants_final_female
```

```{r}
## calculate sex ratios
##subset out H, sorted cells:
df_male <- ss2_mutants_final_male@meta.data[ss2_mutants_final_male@meta.data$exclude_for_sex_ratio == FALSE,]

dim(df_male)

df_female <- ss2_mutants_final_female@meta.data[ss2_mutants_final_female@meta.data$exclude_for_sex_ratio == FALSE,]

dim(df_female)

## make dataframe
df_sex_ratio <- merge(
  as.data.frame(table(df_male$sub_name_updated)), 
  as.data.frame(table(df_female$sub_name_updated)), 
  by = "Var1", all=TRUE)

# or use identity_updated

## add names
names(df_sex_ratio) <- c("genotype", "male", "female")

## change the NAs to 0
df_sex_ratio[is.na(df_sex_ratio)] <- 0

## collapse 820 wild-types together
combined_m <- df_sex_ratio[df_sex_ratio$genotype == "WT-820_3_5", ]$male + df_sex_ratio[df_sex_ratio$genotype == "WT-820", ]$male
combined_f <- df_sex_ratio[df_sex_ratio$genotype == "WT-820_3_5", ]$female + df_sex_ratio[df_sex_ratio$genotype == "WT-820", ]$female
df_sex_ratio <- rbind(df_sex_ratio, c("WT-820-combined", combined_m, combined_f))
# remove old rows
df_sex_ratio <- df_sex_ratio[-which(df_sex_ratio$genotype == "WT-820_3_5" | df_sex_ratio$genotype == "WT-820"), ]
# need to make numeric again
df_sex_ratio$male <- as.numeric(df_sex_ratio$male)
df_sex_ratio$female <- as.numeric(df_sex_ratio$female)
df_sex_ratio$genotype <- as.character(df_sex_ratio$genotype)
# add name for WT combined
df_sex_ratio$genotype[17] <- "WT-820-combined"

## calculate sex ratio
df_sex_ratio$sex_ratio <- (df_sex_ratio$male + 0.1)/(df_sex_ratio$female + 0.1)

## log sex ratio
df_sex_ratio$sex_ratio_log <- log10(df_sex_ratio$sex_ratio)

##view
df_sex_ratio
```

```{r}
## remove WT-2 because it is really inappropriate to have a sex ratio for this:
df_sex_ratio <- df_sex_ratio[-which(df_sex_ratio$genotype == "WT-md5"),]
```

plot
```{r, fig.height = 7, fig.width = 6}
library(latex2exp) # so you can plot the fraction, it's on CRAN `install.packages("latex2exp")`

## make extra column for plotting aesthetics:
df_sex_ratio$above <- df_sex_ratio$sex_ratio_log > 0

## make extra column with ratio in it:
df_sex_ratio$genotype_with_n <- paste0(df_sex_ratio$genotype, " (", df_sex_ratio$male, "/", df_sex_ratio$female, ")")

## reorder genotype so it is in the correct order for plotting
df_sex_ratio$genotype_with_n <- factor(df_sex_ratio$genotype_with_n, levels = df_sex_ratio$genotype_with_n[order(df_sex_ratio$sex_ratio_log)])

## plot
sex_ratio_plot <- ggplot(df_sex_ratio, aes(sex_ratio_log, genotype_with_n, color = above)) +
      ## add the lines for the lollipop plot
      geom_segment(aes(x = 0, y = genotype_with_n, xend = sex_ratio_log, yend = genotype_with_n), color = "grey50") +
      ## add the points for the lollipop plot
      geom_point(aes(size = 4)) +
      ## add the wild-type rectangle
      annotate("rect", xmin= -0.23182478, xmax = 1.00105797, ymin=-Inf , ymax=Inf, alpha=0.4, color=NA,linetype = 2, fill="#999999") +
      ## make prettier
      theme_classic() +
      theme(legend.position = "none", text=element_text(size=16, family="Arial")) + 
      ## change axis labels
      labs(y = TeX("$Genotype\ \\left(\\frac{n_{male}}{n_{female}}\\right)$")) +
      #labs(y = expression(paste("Genotype", group("(", frac(paste("n male"), "n female"), ")")))) +
      xlab(expression(paste("Sex Ratio (", log[10], 
                               group("(",
                                      frac(paste("n male + 0.1"), 
                                           paste("n female + 0.1")),
                                   ")"), ")" ))) +
      ## change colours of lollipops
      scale_colour_manual(values = c("#a52b1e", "#016c00")) +
      ## annotate phenotypes
      geom_hline(aes(yintercept = 5.5)) #+
      #geom_hline(aes(yintercept = 14.5))

print(sex_ratio_plot)

### CHANGE TO BIG BRACKET HERE: https://www.overleaf.com/learn/latex/Brackets_and_Parentheses 

#paste("Sex Ratio", "\n", "log10((n male + 0.1)/(n female + 0.1))")
#theme(legend.position = "none", text=element_text(size=16, family="Arial"))

## fraction titles: https://groups.google.com/forum/#!topic/ggplot2/bgNRnZ82hJY 
## https://uc-r.github.io/lollipop

#ggsave(filename = "../images_to_export/sex_ratio_plot.png", device = "png", width = 7, height = 7, units = "in")
```

## B. SCMAP Method

### Build the index

```{r}
### Making an ortholog reference index

## load in mca data
#counts <- read.csv("../scmap/allpb10x_counts.csv", row.names = 1)
#pheno <- read.csv("../scmap/allpb10x_pheno.csv")
#ggplot(pheno, aes(x=PC2_3d, y = PC3_3d,colour=absclust3)) + geom_point()

## load required libraries
library(scmap) #https://bioconductor.org/packages/release/bioc/html/scmap.html 
library(SingleCellExperiment) #

#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))
#colData(sce) <- cbind(colData(sce), pcs)
#rowData(sce)$feature_symbol <- rowData(sce)$gene

## extract data from Seurat
#cells_tenx <- rownames(tenx.justwt.integrated@meta.data[which(tenx.justwt.integrated@meta.data$experiment == "tenx_5k"), ])
#tenx.justwt.integrated.10k <- subset(tenx.justwt.integrated, cells = cells_tenx)
counts = as.matrix(GetAssayData(tenx.justwt.integrated, slot = "counts", assay = "RNA"))
pheno = as.data.frame(tenx.justwt.integrated@meta.data)

## add UMAP coordinates
df_sex_cell_embeddings <- as.data.frame(tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings)
pheno <- cbind(pheno, df_sex_cell_embeddings)

## Set up object
sce <- SingleCellExperiment(list(counts=counts),
    colData=DataFrame(label=pheno),
    rowData=DataFrame(feature_symbol=rownames(counts)))
sce

## manual logging
#counts_1 <- assay(sce, "counts")
#libsizes <- colSums(counts_1)
#size.factors <- libsizes/mean(libsizes)
#logcounts(sce) <- log2(t(t(counts_1)/size.factors) + 1)
#counts(sce) <- as.matrix(counts(sce))
#logcounts(sce) <- as.matrix(logcounts(sce))
logcounts(sce) <- as.matrix(GetAssayData(tenx.justwt.integrated, slot = "data", assay = "RNA"))

## remove features with duplicated names
sce <- sce[!duplicated(rownames(sce)), ]

## build scmap-cell reference index, save this rds
sce <- selectFeatures(sce, suppress_plot = FALSE, n_features = 2000)
table(rowData(sce)$scmap_features)
set.seed(1)
sce <- indexCell(sce, M = 50, k = 80)
names(metadata(sce)$scmap_cell_index)
#length(metadata(sce)$scmap_cell_index$subcentroids)
#dim(metadata(sce)$scmap_cell_index$subcentroids[[1]])
#metadata(sce)$scmap_cell_index$subcentroids[[1]][,1:5]
#dim(metadata(sce)$scmap_cell_index$subclusters)

#saveRDS(pb_filtered_sce_orth, file="pb_filtered_sce_orthindex_20181109.rds")
```

### Map to the index

```{r}
#rowData(pb_filtered_sce_orth)$feature_symbol <- rowData(pb_filtered_sce_orth)$orth_name
#rownames(pb_filtered_sce_orth) <- rowData(pb_filtered_sce_orth)$orth_name

#prep the SCE, if was originally a Suerat object need the dfs to be regular matrices
#pb_filtered_sce_orth <- pb_filtered_sce_orth[, colData(pb_filtered_sce_orth)$absclust3 != "8"]
#sce <- pb_filtered_sce_orth
#pca <- plotPCA(sce)
#pcs <- pca$data
#table(rownames(pcs)==colnames(sce))

#colData(sce) <- cbind(colData(sce), pcs)

mutants_counts = as.matrix(GetAssayData(mutant_seurat, slot = "counts", assay = "RNA"))
mutants_pheno = as.data.frame(mutant_seurat@meta.data)

#rowData(sce)$feature_symbol <- rowData(sce)$gene
mutants.sce <- SingleCellExperiment(list(counts=mutants_counts),
    colData=DataFrame(label=mutants_pheno),
    rowData=DataFrame(feature_symbol=rownames(mutants_counts)))
mutants.sce

## add log counts
## manual
#counts_1 <- assay(mutants.sce, "counts")
#libsizes <- colSums(counts_1)
#size.factors <- libsizes/mean(libsizes)
#logcounts(mutants.sce) <- log2(t(t(counts_1)/size.factors) + 1)
## from seurat
logcounts(mutants.sce) <- as.matrix(GetAssayData(mutant_seurat, slot = "data", assay = "RNA"))

#Project query data set onto cell index
scmapCell_results <- scmapCell(
  mutants.sce, 
  list(wt = metadata(sce)$scmap_cell_index
  )
)

##Look into the results
# For each dataset there are two matricies. cells matrix contains the top 10 (scmap default) cell IDs of the cells of the reference dataset that a given cell of the projection dataset is closest to:
#   
#   Give assignments in two ways:
#   1. Take the top cell assignment abs clust, if cosine similarity is less than 0.4 (or adjust if needed) mark as unassigned
# 2. For the top 3 nearest neighbors, get a mean of the PCA coordinates and snap to the nearest cell of those coordinates. If any of the top three cells are sim below 0.4 then mark as unassigned.


##Top cell assignment method

## look at assignments
scmapCell_results$wt$cells[, 1:3]
## get cell indexes 
getcells <- scmapCell_results$wt$cells[1, ]
## exctract cells metadata from index
cdsce <- colData(sce)[getcells, ]
## extract similarities
topsim <- scmapCell_results$wt$similarities[1, ]
## get name of top cell matched
mutants.sce$topcell <- rownames(cdsce)
## get sex id
mutants.sce$topcell_ac <- cdsce$label.monocle_sex
## get coordinates of matched cell
mutants.sce$indexPC1 <- cdsce$label.UMAP_1
mutants.sce$indexPC2 <- cdsce$label.UMAP_2
#mutants.sce$pbpt <- cdsce$pseudotime
#mutants.sce$pbbulk <- cdsce$bulk
mutants.sce$topcell_sp <- mutants.sce$topcell_ac
mutants.sce$topsim <- topsim
mutants.sce$topcell_sp[mutants.sce$topsim < 0.4] <- "unassigned"
table(mutants.sce$topcell_sp)
```

```{r}
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- colData(mutants.sce)[colData(mutants.sce)$label.genetic_background =="PBANKA_820", ]
table(df_820$topcell_sp, df_820$label.fluoresence_sorted_on)
```

```{r}
table(colData(mutants.sce)$topcell_sp, colData(mutants.sce)$label.identity_name_updated)
```


```{r}
#### TOP 3NN method

#This function makes a list of the PC means for each cell and then do.call below rbinds them into a dataframe called big_data

datalist = list()

for (i in colnames(scmapCell_results$wt$cells)) {
  
  getcellstest <- scmapCell_results$wt$cells[1:3, i]
  cdscetest <- colData(sce)[getcellstest, ]
  PC1mean <- mean(cdscetest$label.UMAP_1)
  PC2mean <- mean(cdscetest$label.UMAP_2)
  # ... make some data
  dat <- data.frame(i, PC1mean, PC2mean)
  dat$i <- i  # maybe you want to keep track of which iteration produced it?
  datalist[[i]] <- dat # add it to your list
}

big_data = do.call(rbind, datalist)
# or big_data <- dplyr::bind_rows(datalist)
# or big_data <- data.table::rbindlist(datalist)

test <- big_data[1, ]

df <- data.frame(X=colData(sce)$label.UMAP_1, Y=colData(sce)$label.UMAP_2, row.names = rownames(colData(sce)))

#the snap function snaps to the nearest cell in PC coordiantes
snap <- function(df, test){
  require(Biobase)
  d <- matchpt(as.matrix(df),
               as.matrix(data.frame(X=test$PC1mean,Y=test$PC2mean)))
  
  min_row <- rownames(d[d$distance==min(d$distance),])
  
  test$X_snap <- unique(df[min_row,"X"])
  test$Y_snap <- unique(df[min_row,"Y"])
  test$pb_cell <- min_row
  
  test
}

#this loops through each cell and in big_data and runs the snap function
datalist2 = list()
colnames(big_data) <- c("sample_id", "PC1mean", "PC2mean")
for (i in rownames(big_data)) {
  test <- big_data[i, ]
  coord <- snap(df, test)
  coord$i <- i
  datalist2[[i]] <- coord
}
big_data2 = do.call(rbind, datalist2)


table(rownames(big_data2)==rownames(colData(mutants.sce)))

allpbcd <- colData(sce)
allpbcd <- as.data.frame(allpbcd)
pbabsclust <- allpbcd[, c("label.pt_id_cols", "label.identity_name_updated", "label.monocle_sex"), drop=FALSE]
pbabsclust$pb_sample_id <- rownames(pbabsclust)

# Now merge the pc cell asignments with their abs clust and get in the right order
big_data3 <- merge(big_data2, pbabsclust, by.x = "pb_cell", by.y = "pb_sample_id", all.x=TRUE, all.y=FALSE)
big_data4 <- big_data3[match(rownames(big_data2), big_data3$sample_id), ]


colors <- c("6"="#78C679",
            "2"="#D1EC9F",
            "0"="#FEB24C",
            "1"="#F4CF63",
            "3"="#FEEEAA",
            "4"="#85B1D3",
            "7"="#9ecae1",
            "5"="#C9E8F1",
            "M"= "#B7B7D8",
            "F"="#9C96C6",
            "unassigned"="black")

ggplot(big_data4, aes(PC1mean, PC2mean)) +
  geom_point(col = big_data4$label.pt_id_cols) +
                     theme_void() +
                     coord_fixed()

#+ geom_point(aes(colour=factor(big_data4$label.label.pt_id_cols))) + scale_color_manual(values = colors) 

                     

ggplot(big_data4, aes(X_snap, Y_snap)) +
  geom_point(col = big_data4$label.pt_id_cols) +
                     theme_void() +
                     coord_fixed() 

#+ geom_point(aes(colour=factor(big_data4$label.label.pt_id_cols))) + scale_color_manual(values = colors) 

##add info to SCE and save colData, to be an assigned cell all 3NN must have a cos sim >0.4
scmapCell_results$wt$similarities[, 1:3]
topsim1 <- scmapCell_results$wt$similarities[1, ]
topsim2 <- scmapCell_results$wt$similarities[2, ]
topsim3 <- scmapCell_results$wt$similarities[3, ]

table(big_data4$sample_id==rownames(colData(mutants.sce)))
#pfobj$pb_cell <- big_data4$pb_cell
#pfobj$PC1mean <- big_data4$PC1mean
bd4 <- big_data4[, c("pb_cell", "sample_id", "PC1mean", "PC2mean", "X_snap", "Y_snap", "label.pt_id_cols", "label.identity_name_updated", "label.monocle_sex")]

colData(mutants.sce) <- cbind(colData(mutants.sce), bd4)

mutants.sce$topsim1 <- topsim1
mutants.sce$topsim2 <- topsim2
mutants.sce$topsim3 <- topsim3
mutants.sce$stage_pred <- big_data4$label.monocle_sex
mutants.sce$stage_pred[mutants.sce$topsim1 < 0.4 | mutants.sce$topsim2 < 0.4 | mutants.sce$topsim3 < 0.4] <- "unassigned"
table(mutants.sce$stage_pred)

#write.csv(bd4, "pfcellassignmentswithmean3nn_20181029.csv")
#write.csv(colData(pfobj), "pf3d7100scmapclusts2methodindexn100_20190107.csv")
```

```{r}
## see how this overlaps with fluorescence sorted on for 820 to confirm accuracy
df_820 <- colData(mutants.sce)[colData(mutants.sce)$label.genetic_background =="PBANKA_820", ]
table(df_820$stage_pred, df_820$label.fluoresence_sorted_on)
```

```{r}
table(colData(mutants.sce)$stage_pred, colData(mutants.sce)$label.identity_name_updated)
```


## make final plot with MCA data below
```{r}
## Read in MCA data
pheno <- read.csv("../scmap/allpb10x_pheno.csv")
## extract rows needed for plotting
df_plot <- pheno[,c("PC2_3d", "PC3_3d", "absclust3")]
df_plot$experiment <- df_plot$absclust3

## extract rows needed for plotting from mapped object
df_plot_pm <- as.data.frame(colData(pm_ss2_field.sce.orth))
df_plot_pm <- df_plot_pm[ ,c("X_snap", "Y_snap", "label.absclust3")]
df_plot_pm$experiment <- "pm"
colnames(df_plot_pm) <- c("PC2_3d", "PC3_3d", "absclust3", "experiment")

## extract rows needed for plotting from mapped object
df_plot_pf <- as.data.frame(colData(pf_ss2_field.sce.orth))
df_plot_pf <- df_plot_pf[ ,c("X_snap", "Y_snap", "label.absclust3")]
df_plot_pf$experiment <- "pf"
colnames(df_plot_pf) <- c("PC2_3d", "PC3_3d", "absclust3", "experiment")

## bind together
df_plot <- rbind(df_plot, df_plot_pf, df_plot_pm)

colors <- c("6"="#78C679",
            "2"="#D1EC9F",
            "0"="#FEB24C",
            "1"="#F4CF63",
            "3"="#FEEEAA",
            "4"="#85B1D3",
            "7"="#9ecae1",
            "5"="#C9E8F1",
            "M"= "#B7B7D8",
            "F"="#9C96C6",
            "unassigned"="black",
            "pm" = "#dc65a4",
            "pf" = "#24245f")

## add alpha for plotting
df_plot$alpha[!df_plot$experiment == "pm"] <- 0.5
df_plot$alpha[df_plot$experiment == "pm"] <- 1
df_plot$alpha[df_plot$experiment == "pf"] <- 1

##plot
scmap_pca <- ggplot(df_plot, aes(x=PC2_3d, y = PC3_3d,colour=experiment, alpha = alpha)) + 
  geom_point() +
  theme_void() +
  scale_color_manual(values = colors) +
  ## remove alpha scale
  scale_alpha_continuous(guide=FALSE)
  ## draw ring around field samples
  #geom_point(data=df_plot[df_plot$experiment == "pm", ], pch=21, fill=NA, size=2, colour="black", stroke=1) +
  ## make non-field samples more opaque
  #geom_point(data=df_plot[-df_plot$experiment == "pm", ], alpha = 0.2)

## view
scmap_pca_2 <- scmap_pca + 
  guides(colour=guide_legend(override.aes = list(size=4)))

scmap_pca_2
```

save
```{r}
#ggsave("../images_to_export/field_samples_scmap.png", plot = scmap_pca_2, device = "png", path = NULL, scale = 1, width = 15, height = 10, units = "cm", dpi = 300, limitsize = TRUE)
```

## Overlap and validation

```{r}

```

```{r}

```

# 11. Save and Export {.tabset}

## save

Save environment
```{r}
## This saves everything in the global environment for easy recall later
#save.image(file = "GCSKO_merge.RData")
#load(file = "GCSKO_merge.RData")
```

Save object(s)
```{r}
## Save an object to a file
saveRDS(tenx.justwt.integrated.sex, file = "../data_to_export/tenx.justwt.integrated.sex.RDS")
## Restore the object
#readRDS(file = "../data_to_export/tenx.mutant.integrated.sex.RDS")

## save integrated object to file
saveRDS(tenx.justwt.integrated, file = "../data_to_export/tenx.justwt.integrated.RDS") 
## restore the object
#tenx.justwt.integrated <- readRDS("../data_to_export/tenx.justwt.integrated.RDS")

## save monocle object
saveRDS(monocle.object.all, file = "../data_to_export/monocle.object.all.RDS") 
## restore the object
#monocle.object.all <- readRDS("../data_to_export/monocle.object.all.RDS")

## save integrated object to file
#saveRDS(GCSKO_mutants, file = "../data_to_export/GCSKO_mutants.RDS") 
## restore the object
#tenx.justwt.integrated <- readRDS("../data_to_export/tenx.justwt.integrated.RDS")

## extract UMAP coordinates and metadata
## extract metadata
mca_website_df <- tenx.justwt.integrated@meta.data

## add umap - merge on row names and then make sure row names are not dropped or added
mca_website_df <- transform(merge(mca_website_df, tenx.justwt.integrated@reductions[["umap"]]@cell.embeddings, by=0, all=TRUE), row.names=Row.names, Row.names=NULL)

## send this to sunil:
# ggplot(mca_website_df, aes(x = UMAP_1, y = UMAP_2)) +
# geom_point(col = mca_website_df$pt_id_cols) +
# theme_void() +
# coord_fixed()

## export
write.csv(mca_website_df, file = "../data_to_export/mca_website_df.csv")
```

Clean up
```{r}
#rm(ss2_wt_cells)
#rm(tenx.justwt.integrated)
#rm(tenx.justwt.list)
```

## final figure construction

[Cowplot](https://wilkelab.org/cowplot/articles/plot_grid.html)(plot_grid), [patchwork](https://cran.r-project.org/web/packages/patchwork/vignettes/patchwork.html)(wrap_plots), and [ggpubr](http://www.sthda.com/english/articles/24-ggpubr-publication-ready-plots/81-ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page/) can all allow multiple plots to be plotted together. 

```{r, fig.height = 11.7, fig.width = 8.3}
## A
# umap_id_pt
## B
# marker gene expression
## C
# Mutant gene expression
## D
# Modules

#Figure_A <- grid.arrange(arrangeGrob(QC_composite_plot, QC_mito_violin, QC_mito_graph, QC_by_genotype, mapping_rate_plot), nrow=3), nrow=2, heights=c(10,2))

## cowplot method
## can use this for labels: toupper(letters)[1:10]

## C. Mutant genes
mutant_genes_figure <- plot_grid(
                                 ## marker genes starts
                                 marker_gene_plot_FAMB,
                                 marker_gene_plot_MSP8,
                                 marker_gene_plot_MSP1,
                                 marker_gene_plot_AP2G,
                                 marker_gene_plot_CCP2,
                                 marker_gene_plot_MG1,
                                 ## mutant genes starts  
                                 marker_gene_plot_gd1,
                                 marker_gene_plot_md1,
                                 marker_gene_plot_md2,
                                 marker_gene_plot_md3,
                                 marker_gene_plot_md4, 
                                 marker_gene_plot_md5,
                                 marker_gene_plot_fd1,
                                 marker_gene_plot_fd2,
                                 marker_gene_plot_fd3,
                                 marker_gene_plot_fd4, 
  label_size = 1, 
  nrow = 4)

## labels as uppercase letters:
#c(toupper(letters)[2:17])
# labels as roman numerals:
# c(tolower(as.roman(c(2:17))

Figure_publication <- plot_grid(umap_id_pt + theme(plot.margin = unit(c(0, 0, 0, 0), "cm")), 
                                mutant_genes_figure,
                                ## add empty plot to give spacing
                                ggplot() + theme_void(),
                                labels = c('A', 'B'), 
                                label_size = 12, 
                                ncol = 2, 
                                nrow=2, 
                                rel_heights = c(1, 1, 4), 
                                rel_widths = c(1, 2, 3))

Figure_publication
```

save
```{r}
ggsave("../images_to_export/Figure_C.png", plot = Figure_publication, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
```

```{r, fig.height = 11.7, fig.width = 8.3}
layout <- "
AAABBB
CCDDEE
FFGGHH
IIJJKK
"

Figure_publication_A <- (composition_umap_10x + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10))) +
(composition_umap_ss2 + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10))) + 
plot_layout(ncol = 2)

Figure_publication <- Figure_publication_A +
#(UMAP_hoo + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10)) + labs(title="Predicted Asexual Cycle Real Timepoint")) +
#(UMAP_kasia + theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 10)) + labs(title="Predicted Gametocytogenesis Real Timepoint")) +
  ## marker genes
marker_gene_plot_FAMB +
marker_gene_plot_MSP8 +
marker_gene_plot_MSP1 +
marker_gene_plot_CDPK5 +
marker_gene_plot_AP2G +
marker_gene_plot_CCP2 +
marker_gene_plot_MG1 +
marker_gene_plot_ORC1 +
marker_gene_plot_MCM4 +
  ## mutant genes starts  
#marker_gene_plot_gd1 +
#marker_gene_plot_md1 +
#marker_gene_plot_md2 +
#marker_gene_plot_md3 +
#marker_gene_plot_md4 + 
#marker_gene_plot_md5 +
#marker_gene_plot_fd1 +
#marker_gene_plot_fd2 +
#marker_gene_plot_fd3 +
#marker_gene_plot_fd4 + 
plot_layout(ncol = 6, 
            widths = c(1, 1, 1, 1, 1 ,1),
            heights = c(2, 1, 1, 1),
            design = layout
            )

Figure_publication
```

save
```{r}
ggsave("../images_to_export/Figure_S2.png", plot = Figure_publication, device = "png", path = NULL, scale = 1, width = 21, height = 29.7, units = "cm", dpi = 300, limitsize = TRUE)
```

# Appendix {.tabset}

## Session Info 
```{r, echo = FALSE}
sessionInfo()
```

# Expression plots

#### Expression - raw - Viridis
```{r, fig.height = 15, fig.width = 15}
## find a good ring marker, to see if there is a better one than the ones reported
#markers_ring <- FindMarkers(tenx.justwt.integrated, ident.1 = c("4", "5", "16", "11", "7", "3", "9", "0", "22"))
#head(markers_ring)

# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 


marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1319500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(ccp2), "(Female)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0416100", coord.fixed = TRUE, 
                                    #min.cutoff = "q1", 
                                    dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(mg1), "(Male)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1437500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(ap2g), "(Commitment)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0831000", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(msp1), "(Schizont)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1102200", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(msp8), "(Asexual)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1101300", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(sbp1), "(Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0722600", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(fam-b2),  "(Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

marker_gene_plot_HSP70 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0711900", coord.fixed = TRUE, 
                                      #min.cutoff = "q1", 
                                      dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste(italic(HSP70),  "(Reporter)","\n", "PBANKA_0711900")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))

## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)

marker_gene_plot_all

## patchwork method
#marker_gene_plot_FAMB + marker_gene_plot_MSP8 + marker_gene_plot_MSP1 + marker_gene_plot_AP2G + marker_gene_plot_CCP2 + marker_gene_plot_MG1 + marker_gene_plot_HSP70
```

#### Expression - raw - purple
```{r, fig.height = 15, fig.width = 15}

marker_gene_ramp <- colorRampPalette(c("#D3D3D3", "#1D1564"))(50)

marker_gene_plot_CCP2 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1319500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("CCP2 (Female)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_MG1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0416100", coord.fixed = TRUE, 
                                    #min.cutoff = "q1", 
                                    dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MG1 (Male)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_AP2G <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1437500", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("AP2G (Commitment)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_MSP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0831000", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MSP1 (Schizont)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_MSP8 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1102200", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("MSP8 (Asexual)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_SBP1 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-1101300", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("SBP1 (Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_FAMB <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0722600", coord.fixed = TRUE, 
                                     #min.cutoff = "q1", 
                                     dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("Fam-b2 (Ring)")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

marker_gene_plot_HSP70 <- FeaturePlot(tenx.justwt.integrated, features = "PBANKA-0711900", coord.fixed = TRUE, 
                                      #min.cutoff = "q1", 
                                      dims = c(1,2), reduction = "umap", pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("(HSP70; Reporter)","\n", "PBANKA_0711900")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_colour_gradientn(colours=marker_gene_ramp)

## plot
## cowplot method
marker_gene_plot_all <- plot_grid(marker_gene_plot_FAMB, marker_gene_plot_MSP8, marker_gene_plot_MSP1, marker_gene_plot_AP2G, marker_gene_plot_CCP2, marker_gene_plot_MG1, marker_gene_plot_HSP70, nrow=3)

marker_gene_plot_all
```

#### Expression - data - purple

```{r, fig.height = 10, fig.width = 10}
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-1435200        GCSKO-20  FD4 
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-0413400    GCSKO-10_820  MD5
# PBANKA-0828000         GCSKO-3  GD1
# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-1144800        GCSKO-28  FD5


marker_gene_plot_17 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1418100", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_2 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0102400", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md3")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_19 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0716500", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_20 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1435200", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd4")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_13 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0902300", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_10 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0413400", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md5")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_3 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-0828000", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("gd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_oom <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1302700", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_29 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1447900", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("md2")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

marker_gene_plot_21 <- FeaturePlot(tenx.justwt.integrated, dims = c(1,2), reduction = "umap", features = "PBANKA-1454800", coord.fixed = TRUE,  pt.size = 1, order = TRUE) + 
  theme_void() + 
  labs(title = paste("fd1")) + 
  theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) + 
  scale_color_continuous_sequential(palette = "Purples 2") +
  ## add sex symbols
  annotate("text", x = 3.8, y = 1.5, label = male_symbol, size=7, color="gray") + 
  annotate("text", x = 2, y = 2.8, label = female_symbol, size=7, color="gray")

##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))

## make composite plot
mutant_expression_composite <- wrap_plots(marker_gene_plot_17 , marker_gene_plot_2 , marker_gene_plot_19 , marker_gene_plot_20 , marker_gene_plot_13 , marker_gene_plot_10 , marker_gene_plot_3 , marker_gene_plot_oom , marker_gene_plot_29 , marker_gene_plot_21 , ncol = 4)
           
## print
mutant_expression_composite
```

# Density Plots

#### Density - purple

```{r}
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=marker_gene_ramp) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=marker_gene_ramp) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=marker_gene_ramp) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)

UMAP_composite_mutant_genes
```


#### Density -viridis

```{r}
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker - discovered by looking for marker genes in data
# PBANKA-0722600 - Fam-b2 - ring marker - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5113031/ 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)) )+ guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)

UMAP_composite_mutant_genes
```

#### Density - viridis
```{r, fig.height = 10, fig.width = 10}
# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0413400    GCSKO-10_820  MD5

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "ks")

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + coord_fixed() + theme_void() + labs(title = paste("md2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)) )+ guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)

UMAP_composite_mutant_genes

## IF putting in MAIN - list_of_density_plots_mutant_genes[[2]] + coord_fixed() + theme_void() + theme(plot.title = element_text(hjust = 0.5), legend.text = element_text(size = 8), text=element_text(size=9)) + labs(title = paste("md1")) + scale_colour_gradientn(colours=c("#DCDCDC", plasma(30))) + guides(colour = guide_colourbar(barwidth = 0.3, barheight = 4.0, title = ""))
```

```{r, fig.height = 10, fig.width = 10}
# PBANKA-0828000         GCSKO-3  GD1

# PBANKA-1302700       GCSKO-oom  MD1 
# PBANKA-1447900        GCSKO-29  MD2
# PBANKA-0102400         GCSKO-2  MD3 
# PBANKA-0716500        GCSKO-19  MD4 
# PBANKA-0413400    GCSKO-10_820  MD5

# PBANKA-1454800        GCSKO-21  FD1
# PBANKA-0902300        GCSKO-13  FD2
# PBANKA-1418100        GCSKO-17  FD3   
# PBANKA-1435200        GCSKO-20  FD4 

markers_list <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-0413400", "PBANKA-1454800", "PBANKA-0902300", "PBANKA-1418100", "PBANKA-1435200")

list_of_density_plots_mutant_genes <- plot_density(tenx.justwt.integrated, markers_list, joint = FALSE, combine = FALSE, dims = c(1,2), pal = "plasma", method = "wkde", slot = 'data')

## make composite plot
UMAP_composite_mutant_genes <- wrap_plots(list_of_density_plots_mutant_genes[[1]] + 
                                            coord_fixed() + 
                                            theme_void() + 
                                            labs(title = paste("gd1")) + 
                                            scale_color_continuous_sequential(palette = "Purples 2") + 
                                            guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[2]] + 
                               coord_fixed() + theme_void() + 
                               labs(title = paste("md1")) + 
                               scale_color_continuous_sequential(palette = "Purples 2") + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[3]] + 
                               coord_fixed() + 
                               theme_void() + 
                               labs(title = paste("md2")) + scale_color_continuous_sequential(palette = "Purples 2") + 
                               guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[4]] + coord_fixed() + theme_void() + labs(title = paste("md3")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[5]] + coord_fixed() + theme_void() + 
  labs(title = paste("md4")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[6]] + coord_fixed() + theme_void() + labs(title = paste("md5")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[7]] + coord_fixed() + theme_void() + labs(title = paste("fd1")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[8]] + coord_fixed() + theme_void() + labs(title = paste("fd2")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[9]] + coord_fixed() + theme_void() + labs(title = paste("fd3")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             list_of_density_plots_mutant_genes[[10]] + coord_fixed() + theme_void() + labs(title = paste("fd4")) + scale_color_continuous_sequential(palette = "Purples 2") + guides(colour = guide_colourbar(barwidth = 0.5, title = "")),
                             ncol = 4)

## old colour scale: scale_colour_gradientn(colours=marker_gene_ramp )

UMAP_composite_mutant_genes
```





